Entanglement distillation refers to the task of transforming a collection of weakly entangled pairs 
into fewer highly entangled ones. It is a core ingredient in quantum repeater protocols, needed to 
transmit entanglement over arbitrary distances in order to realise quantum key distribution schemes. 
Usually, it is assumed that the initial entangled pairs are i.i.d. and uncorrelated with each 
other, an assumption that might not be reasonable at all in any entanglement generation process 
involving memory channels. Here, we introduce a framework that captures entanglement distillation in 
the presence of natural correlations arising from memory channels. Conceptually, we bring together 
ideas from condensed-matter physics, namely renormalisation and matrix-product states and 
operators, with those of local entanglement manipulation, Markov chain mixing, and quantum error 
correction. We identify meaningful parameter regions for which we prove convergence to maximally 
entangled states, arising as the fixed points of a matrix-product operator renormalisation flow. 


It has long been noted in the field of quantum information science that entanglement 
constitutes the key resource in various information processing and specifically communication tasks [1]. 
Secure quantum key distribution necessarily relies on entanglement, even in prepare and measure 
schemes [2]. A central goal in quantum information science has been the development of 
techniques to transform less useful forms of entanglement into more suitable ones, and to enhance our 
understanding of the laws governing the manipulation of entanglement. The task of entanglement 
distillation specifically captures the resource character of entanglement, in that it aims at 
preparing maximally entangled states from noisy or less entangled ones [3]. The concept of distillable 
entanglement captures the maximum rate at which this is asymptotically possible, starting from a 
collection of many identically prepared quantum systems, and is hence of profound interest in 
the conceptual foundations of the field. Distillation steps are part of quantum repeater protocols 
[4-7], necessary to distribute entanglement over arbitrary distances using noisy quantum channels: 
In such a scheme, entanglement is established between different repeater stations and 
transferred to the final designated nodes via suitable entanglement swapping steps. Distillation schemes 
considered in this context are often iterative schemes, such as the recurrence protocol [3, 8] and 
deterministic protocols based on error-correction codes [3, 9-11]. While iterative schemes do not 
achieve the maximum rates set by the distillable entanglement, they require less sophisticated and 
more practically feasible operations. 

The silent assumption in almost all of the proposed distillation schemes, however, 
is that the initial resources have been identically prepared and show no correlations whatsoever. 
While this is surely a good assumption in many preparations, it might not be reasonable at all in 
others. Whenever memory effects or channels [12-14] are involved, one expects some 
correlations between the involved entangled pairs, going beyond an i.i.d. setting. These correlations are 
expected to decay rapidly over several pairs sent through a channel, reflecting the natural 
correlation structure arising from a memory channel (see Fig. 1). This may even be a desired feature, 
if resetting the system is too elaborate or too costly in time. The mathematical 
definition of distillable entanglement in the presence of correlations has been developed [15, 16]. 
Yet, the important practical problem of distilling entanglement from correlated pairs arising from 
quantum memory channels is still wide open. 

In this work, we propose a conceptually novel way forward to solve this problem, bringing 
together ideas from entanglement theory with those of condensed matter theory, specifically of 
renormalisation techniques and tensor network methods. We start by identifying the natural class 
of states arising from preparations and memory channels as bi-partite matrix-product operators 
(MPO) [17, 18]. Such classes of states are usually considered in the condensed matter context to 
capture thermal many-body states or those arising from open systems dynamics [17-20]. Here, 
we encounter natural bi-partite instances thereof. Entanglement distillation is then identified as a 
renormalisation of bi-partite matrix-product operators. The methods are inspired by and derive 
from renormalisation [21] of matrix-product states [22-27], again from many-body theory. 

Specifically, we show that both the recurrence protocol and an error correction based protocol 
ifTl converge to pairs of maximally entangled pure states for suitably correlated pairs which are 
naturally described by an MPO. This leads to entanglement distillation very similar to the i.i.d. 
case; surprisingly, allowing for principally unwanted correlations between subsequent pairs can 
even speed up the convergence to maximally entangled pairs compared to the uncorrelated i.i.d. 
case. We discuss a simple physical example where this is the case for a large parameter region of 
initial fidelities and correlation strengths. 


Setting and formalism 

We consider a sequence of $L$ pairs of qubits, where two parties (say Alice and Bob) each hold 
one qubit from each pair. The focus on qubits is for simplicity of notation only; it is clear 
that the same framework can be applied to systems of other physical dimension. These pairs are 
entangled, as well as correlated with each other, as a consequence of the preparation procedure 
involving stationary quantum memory effects. A natural preparation exhibiting such a memory 
involves an auxiliary quantum system $C$ of some dimension $d$ that embodies all the degrees of 
freedom of the memory. The state is then prepared in a sequential fashion, with the memory 
unitarily interacting with the first entangled pair, then the second, and so on [13, 23, 28]. A 
state generated in this way is given by a matrix-product state, if it is pure, or a matrix-product 
operator in the case of noisy mixed states [17, 18], as they are considered here, with $d$ taking the role 
of the bond dimension. The decay of memory effects in the distance between the entangled pairs 
naturally emerges in this construction. We now describe how naturally correlated bi-partite MPOs 
arise from this setting. 

FIG. 1. (a) Source with memory emitting weakly correlated photon pairs and (b) the MPO $\rho$ naturally 
describing this setting. 

More specifically, we work in a numerically indexed Bell basis 
$(|\phi_1\rangle, |\phi_2\rangle, |\phi_3\rangle, |\phi_4\rangle)$, more commonly labelled as 
$(|\phi^+\rangle, |\phi^-\rangle, |\psi^+\rangle, |\psi^-\rangle)$. We consider a sequence of $L$ pairs of qubits, 
with basis vectors $|\phi_{\mathbf{x}}\rangle = |\phi_{x_1}\rangle |\phi_{x_2}\rangle \cdots |\phi_{x_L}\rangle$, 
where Alice holds the first qubit of each pair and Bob holds its partner. Translationally invariant 
mixed states reflecting stationarity of the source are described in the MPO language as 

$$ \langle \phi_{\mathbf{x}} | \rho | \phi_{\mathbf{y}} \rangle = \mathrm{Tr}\left[ M^{x_1,y_1} M^{x_2,y_2} \cdots M^{x_L,y_L} \right]. \tag{1} $$

Purely for simplicity of notation, we take periodic boundary conditions here. The dimension of the 
matrices $M^{x,y} \in \mathbb{C}^{d \times d}$, $x, y \in \{1, \dots, 4\}$, limits the correlations between pairs, and by increasing 
this bond dimension $d$, arbitrary quantum states can be described in this formalism. There is a 
gauge freedom in our choice of MPO matrices, as for any invertible $S$ the mapping 

$$ M^{x,y} \mapsto S M^{x,y} S^{-1} \tag{2} $$

will give an alternative description of the same physical state. Generally, 16 matrices are needed 
for the description of each pair, reflecting the two-particle density matrix. However, without loss 
of generality we can assume each $M^{x,y}$ to be Bell diagonal. If the state was not Bell diagonal 
originally, it can be brought into this form by a suitable local group twirl over the Pauli group (T[. 
Since the employed protocols only make use of Clifford operations, the group twirl will commute 
with these operations, so that it can be implemented at the very end or merely at the level of 
classical data processing. For this reason we use the shorthand $A = M^{1,1}$, $B = M^{2,2}$, $C = M^{3,3}$ 
and $D = M^{4,4}$. Without loss of generality, we consider the distillation of maximally entangled $\phi^+$ 
pairs. The “A” matrix will be the dominant matrix, while the others we will call noise matrices. 

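As an illustration of Eqs. (1) and (2), the following minimal sketch evaluates one diagonal Bell-basis element of a translationally invariant MPO with bond dimension $d = 2$ and checks that a gauge transformation leaves it unchanged. The numerical entries of the matrices, the choice of $S$, and all function names are hypothetical and chosen only for illustration; the element is left unnormalised.

```python
# Sketch: evaluate Eq. (1) for a Bell-diagonal MPO with 2x2 matrices and
# check the gauge freedom of Eq. (2). All numerical values are illustrative.

def matmul(P, Q):
    """Product of two 2x2 matrices stored as nested lists."""
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def inverse(S):
    """Inverse of an invertible 2x2 matrix."""
    det = S[0][0] * S[1][1] - S[0][1] * S[1][0]
    return [[S[1][1] / det, -S[0][1] / det],
            [-S[1][0] / det, S[0][0] / det]]

def mpo_element(mats, labels):
    """Tr[M^{x_1,x_1} ... M^{x_L,x_L}] with periodic boundary conditions."""
    prod = [[1.0, 0.0], [0.0, 1.0]]
    for x in labels:
        prod = matmul(prod, mats[x])
    return prod[0][0] + prod[1][1]

# Hypothetical coefficient matrices A, B, C, D (labels 1..4).
mats = {
    1: [[0.9, 0.05], [0.05, 0.8]],    # A: dominant matrix
    2: [[0.04, 0.01], [0.0, 0.05]],   # B: noise matrix
    3: [[0.03, 0.0], [0.01, 0.05]],   # C: noise matrix
    4: [[0.03, 0.01], [0.01, 0.05]],  # D: noise matrix
}

# Gauge transformation M -> S M S^{-1} with an arbitrary invertible S.
S = [[2.0, 1.0], [0.5, 1.5]]
S_inv = inverse(S)
gauged = {x: matmul(matmul(S, M), S_inv) for x, M in mats.items()}

labels = (1, 1, 2, 1, 3, 4)  # one Bell-basis configuration for L = 6 pairs
before = mpo_element(mats, labels)
after = mpo_element(gauged, labels)
```

The two values agree up to floating-point error because the factors $S^{-1} S$ cancel between neighbouring matrices and under the trace.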

Protocols and renormalisation 

An $N \to M$ iterative protocol for entanglement distillation of i.i.d. states will act on $N$ pairs 
at a time and output $M$ pairs (where $M < N$). For more than $N$ i.i.d. pairs, the protocol is 
performed in parallel on blocks of $N$ pairs. In the MPO setting pairs are not i.i.d. and so we 
must specify which pairs are involved in each block of a protocol. We choose neighbouring 
pairs, so the first $N$ pairs are distilled into $M$ pairs, while simultaneously the next $N$ pairs are 
distilled, and so on. This natural choice has the practical merit of respecting locality, and has 
the additional advantage that the output state is easily shown to again be an MPO of the same 
bond dimension (see Fig. 2). Every iteration of the distillation protocol now acts as a map from 
an MPO on one scale to the subsequent one, reducing the chain length from $L$ to $LM/N$. 
After each step, a positive MPO is retained [29]. Indeed, it can naturally be seen as a process of 
MPO renormalisation, this being a mixed-state and bi-partite analogue of the renormalisation of 
matrix product states discussed in Ref. [21]. After the $n$-th step, we label the MPO operators 
$\{A_n, B_n, C_n, D_n\}$, where the initial raw state provides the $n = 0$ matrices. We prove several 
results on convergence to entangled states which show the functioning of the schemes; these proofs 
can also be interpreted as convergence proofs for the renormalisation flow of the MPOs. Specifically, 
we consider the recurrence protocol and a distillation through error correction using the 5-qubit 
code. Both protocols rely solely on Clifford operations and use local operations and classical 
communication with respect to the partition between Alice and Bob. Intuitively, one can say that 
in many practically relevant settings, the entanglement and correlations between pairs are being 
“renormalised” into more useful entanglement shared between Alice and Bob, to be employed in 
subsequent key generation. 

FIG. 2. Renormalisation schemes of MPO. (a) Mapping $N$ pairs to $M$ in an $N \to M$ scheme. (b) In the 
$2 \to 1$ recurrence protocol two neighbouring MPOs are conjugated with a local unitary and subjected to a 
measurement. Contraction of the tensor network leads to the MPO at the subsequent scale. 


Recurrence protocol 

The recurrence protocol is a $2 \to 1$ iterative protocol which uses post-selection. At every round 
measurement outcomes are produced and we only proceed if certain outcomes are obtained. 
Here we use a slightly improved version [8] of the recurrence scheme [30]. Cast into the MPO 
language, the iteration formula after two steps is 

$$ A_{n+2} = (A_n^2 + B_n^2)^2 + (C_n^2 + D_n^2)^2, \tag{3} $$

$$ B_{n+2} = \{A_n, B_n\}^2 + \{C_n, D_n\}^2, \tag{4} $$

$$ C_{n+2} = \{A_n^2 + B_n^2,\; C_n^2 + D_n^2\}, \tag{5} $$

$$ D_{n+2} = \{\{A_n, B_n\}, \{C_n, D_n\}\}, \tag{6} $$

where curly brackets denote the anti-commutator. After two steps, the matrices are 
re-gauged and rescaled. Replacing matrices by commuting scalars recovers the original i.i.d. result. 
Proving convergence of the initial MPO in the state $\rho$ to the maximally entangled state is 
achieved by showing that the noise matrices vanish exponentially faster than the $A_n$ matrices. 
Specifically, an appropriate ratio of norms will be exponentially suppressed. We define a norm 
$\|M\|$ in terms of a channel $\mathcal{M}$, Choi-isomorphic to $M$, so that $\|M\| = \|\mathcal{M}\|_{1\to1}$, where we use 
the induced “1-to-1” norm [31]. We introduce the noise contribution of the coefficient matrices 
$B_n$, $C_n$, and $D_n$ as 

$$ \epsilon_n = \max\left( \|B_n\|_{1\to1}, \|C_n\|_{1\to1}, \|D_n\|_{1\to1} \right). \tag{7} $$


Due to norm sub-multiplicativity, one finds that an initially small $\epsilon_0$ entails that $\epsilon_n$ vanishes with $n$. However, 
ensuring that $A_{n+2}$ stays large is difficult. To do so, we adjust the MPO gauge after two steps, 
using a suitable gauge transformation and re-scaling, so that $\mathcal{A}_{n+2}$ is trace-preserving 
and hence $\|\mathcal{A}_{n+2}\|_{1\to1} = 1$. To quantify how much the gauge transformation changes the matrix 
norm, we rely on the ergodicity coefficient of the matrices, 

$$ \tau(M) = \max_{\mathrm{Tr}[\sigma]=0} \frac{\|\mathcal{M}(\sigma)\|_1}{\|\sigma\|_1}, \tag{8} $$

which quantifies how rapidly a channel mixes input states into the channel’s 
stationary state. We are interested in the ergodicity of $A_n$, for which we use the shorthand 
$\tau_n := \tau(A_n)$. We are now ready to state the first main result, which provides sufficient conditions 
for entanglement distillation using the recurrence protocol. 

FIG. 3. Region of convergence for the recurrence protocol. The area under the blue line is the region 
fulfilling the conditions given in Theorem 1. The green region is a slightly improved bound that can be obtained 
with computer assistance, but for which we have no closed form expression. The blue dot corresponds to 
our physical example. 


Theorem 1 (Convergence in the recurrence protocol). Given a translationally invariant Bell 
diagonal MPO with coefficient matrices $A_0$, $B_0$, $C_0$, and $D_0$, the iterative application of the recurrence 
protocol leads to convergence to uncorrelated pairs in the maximally entangled state $\phi^+$ for 

$$ \epsilon_0 < \frac{1}{7}\,\frac{1 - \tau_0^4}{1 + \tau_0^4}. \tag{9} $$

The convergence is illustrated in Fig. 3. The proof is presented in full length in the appendix. 
To be self-contained, we will sketch it here, and provide significant intuition. 

To prove convergence, we need to show that the noise matrices decay exponentially fast, 
while $A_n$ stays large. The first part can be shown by considering a double step of the 
protocol, after which all norms of the noise matrices are at least of quadratic order in $\epsilon_n$. This can be shown via 
sub-additivity and sub-multiplicativity of the 1-to-1 norm, 

$$ \epsilon_{n+2} \le 4\,(1 + \epsilon_n^2)\,\epsilon_n^2. \tag{10} $$

However, to ensure the contribution of the dominant matrix stays large, our approach is to regauge 
so that $\mathcal{A}_{n+2}$ is trace-preserving. A channel $\mathcal{A}_n$ is trace-preserving if and only if its 
Hilbert-Schmidt adjoint, the dual channel $\mathcal{A}_n^\dagger$ (the Heisenberg representation of $\mathcal{A}_n$), satisfies 
$\mathcal{A}_n^\dagger(\mathbb{1}) = \mathbb{1}$. When instead $\mathcal{A}_n^\dagger(\xi_n) = \lambda_n \xi_n$ (where $\lambda_n$ is the largest such eigenvalue), then the 
Perron-Frobenius theorem ensures $\xi_n$ is Hermitian. The gauge transformation 
$\mathcal{S}_n(\rho) = \sqrt{\xi_n}\,\rho\,\sqrt{\xi_n}$ and the 
re-scaling by $\lambda_n^{-1}$ will recover trace-preservation, provided $\xi_n$ is invertible. This transformation 
potentially increases the norm of the noise matrices. Using sub-multiplicativity, we find that 

$$ \epsilon_{n+2} \mapsto \lambda_{n+2}^{-1}\,\kappa(S_{n+2})\,\epsilon_{n+2}, \tag{11} $$

where $\kappa(S_n) = \|\mathcal{S}_n\|_{1\to1}\,\|\mathcal{S}_n^{-1}\|_{1\to1}$ is the condition number of $S_n$. We wish to show that $\mathcal{A}_{n+2}$ 
stays close to trace-preserving to keep the condition number small. We show that 

$$ \kappa(S_n) \le \frac{1}{1 - 2k_n}, \qquad k_n = \frac{1 + \tau_n^4}{1 - \tau_n^4}\left(4\epsilon_n^2 + 10\epsilon_n^4\right). \tag{12} $$
The proof bears similarities to the perturbation of the steady state of a trace-preserving quantum 
Markov chain [32], which also depends on the ergodicity coefficient of the transition matrix. The 
basic intuition is that if $\mathcal{A}_n$ is a rapidly mixing channel, with small $\tau_n$, then $\mathcal{A}_n^4$ is also rapidly 
mixing. Before making our gauge transformation, $A_{n+2}$ is a sum of $A_n^4$ and some small noise 
matrices. The more rapidly mixing a channel, the more robust its eigenstates are against the 
perturbative addition of noise matrices. We desire the dual eigenstate $\xi_{n+2}$ before the gauge to 
stay close to $\mathbb{1}$, which we expect for rapid mixing (small $\tau_n$) and low noise (small $\epsilon_n$), as 
we show rigorously in a spirit similar to the ideas of Ref. [33]. 

Although ergodicity is not a matrix norm, it has similar properties such as sub-additivity and 
sub-multiplicativity, from which one can derive an upper bound on $\tau_{n+2}$ in terms of $\tau_n$ and $\epsilon_n$ so 
that $\tau_{n+2} \le \tau_n^4 + f(\epsilon_n, \tau_n)$, where $f$ is a slight correction that vanishes as $\epsilon_n \to 0$. This occurs 
because a double step of the protocol raises the ergodicity coefficient of $A$ to the fourth power, but 
regauging the MPO results in the adjustment $f$. The essential point is that we have bounds on the 
pair $(\epsilon_{n+2}, \tau_{n+2})$ in terms of $(\epsilon_n, \tau_n)$. From these it is straightforward to numerically determine 
the region of convergence to the fixed point $(0, 0)$, which we show in Fig. 3. Also shown in this 
figure is an analytic curve, for which we show $(\epsilon_n, \tau_n) \to (0, 0)$ without numerical aid. Finally, 
convergence in MPO operators again entails convergence of the density matrix. 

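To make the interplay of Eqs. (10) to (12) concrete, the following minimal sketch combines them into a single bound on the regauged noise after a double step. It assumes Eq. (12) is read as $\kappa(S_n) \le 1/(1-2k_n)$ with $k_n = \frac{1+\tau_n^4}{1-\tau_n^4}(4\epsilon_n^2 + 10\epsilon_n^4)$, and it drops the factor $\lambda^{-1}$ of Eq. (11). The function names are hypothetical, and the sketch does not track how $\tau_n$ itself changes under the iteration.

```python
def condition_number_bound(eps, tau):
    """Upper bound on kappa(S_n) as read from Eq. (12); needs 2 * k_n < 1."""
    k = (1 + tau**4) / (1 - tau**4) * (4 * eps**2 + 10 * eps**4)
    if 2 * k >= 1:
        raise ValueError("the bound of Eq. (12) is vacuous for these values")
    return 1 / (1 - 2 * k)

def noise_bound_after_double_step(eps, tau):
    """Bound on the regauged noise after two rounds, from Eqs. (10)-(12)."""
    raw = 4 * (1 + eps**2) * eps**2                  # Eq. (10), before regauging
    return condition_number_bound(eps, tau) * raw    # Eq. (11) without 1/lambda

# Illustrative values: the same initial noise with and without correlations
# entering through the ergodicity coefficient tau.
bound_fast_mixing = noise_bound_after_double_step(0.1, 0.0)
bound_slow_mixing = noise_bound_after_double_step(0.1, 0.5)
```

In this worst-case bound a larger ergodicity coefficient only loosens the estimate, which is in line with the remark that several steps of the proof make worst-case assumptions.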

5-qubit protocol 

For every error correcting code that encodes $k$ qubits into $N$ physical qubits, there exists an 
iterative $N \to k$ one-way entanglement distillation protocol (2- In these protocols, noise 
information is extracted by local measurements, but instead of post-selecting when errors are detected we 
attempt to correct them by determining the smallest weight error consistent with the measurement 
data. The advantage over the recurrence protocol is that this protocol is deterministic, and that 
one-way distillation schemes require much less classical communication. In particular, we 
consider one-way entanglement distillation using the 5-qubit code (so a $5 \to 1$ protocol), which is the 
smallest known code capable of correcting any single qubit error. We again find a closed form for 
the map acting on the coefficient matrices in each iteration, though we omit it here as each 
expression contains $4^4$ terms. We further introduce the transfer matrix $E_n = A_n + B_n + C_n + D_n$ with 
corresponding channel $\mathcal{E}_n$, as it has useful properties over the course of the iterations in a 
deterministic protocol, and state the following theorem. 

Theorem 2 (Error correction). Given a translationally invariant Bell diagonal MPO with 
coefficient matrices $A_0$, $B_0$, $C_0$, and $D_0$, the iterative application of the 5-qubit error correcting code 
leads to convergence to uncorrelated pairs in the maximally entangled state $\phi^+$ for $\epsilon_0 < 1/33$. Here $\epsilon_0$ 
is defined as above in the gauge where $\mathcal{E}_0$ is trace-preserving. 

Again, the full proof is provided in the appendix. Similar to the post-selective case, we show 
that with a growing number of iterations $n$, the contribution of the dominant matrix $A_n$ grows 
exponentially faster than the contribution of the noise matrices. In the deterministic case, we 
can use our gauge freedom to make $\mathcal{E}_n$ trace-preserving. Since we do not post-select, we do not 
need to renormalise in every round. In the 5-qubit error correction code, the transfer matrix is 
always mapped to its fifth power, $E_{n+1} = E_n^5$. Thus, if we initially choose a gauge where the 
corresponding map $\mathcal{E}_0$ is trace-preserving, the transfer matrix will keep this property over the 
course of the iteration. Using combinatorial arguments, we then prove that for suitably small $\epsilon_0$, 
$\epsilon_n$ converges to zero, entailing convergence in fidelity of the physical state. 
Numerical studies and physical Hamiltonians 


To complement the rigorous and analytical results presented above, we have also performed 
numerical studies on randomly drawn matrices. These results demonstrate that a state within 
the distillable region is guaranteed to converge, but several steps in our proof make worst case 
assumptions and so we may also find convergence for many MPO states outside this region. In 
a worst case scenario, correlations are always pernicious. However, our numerics indicate that 
in many strongly correlated chains the correlations can also be beneficial and enable distillation 
at noise levels well above the rigorous threshold as well as in cases where the protocols do not 
converge for uncorrelated i.i.d. states. 

It should be clear that many physical Hamiltonians reflecting interactions with memory 
channels in sequential preparation procedures are covered by our results. To present a paradigmatic 
example, we compare the performance of the recurrence protocol on (i) perfect memoryless i.i.d. 
distributions of $\phi^+$ states with (ii) sequentially prepared states with implemented memory. For the 
preparation of the memory, we prepare uncorrelated Werner states, which then undergo a unitary 
interaction $U(t, J) = \exp(itH)$ on Bob’s side, where the Hamiltonian $H$ is given by 

$$ H(J) = J\,(X \otimes X + Y \otimes Y + Z \otimes Z) + Z \otimes \mathbb{1} + \mathbb{1} \otimes Z. \tag{13} $$

We further implement a de-phasing channel to make the memory forgetful. We will discuss this 
example in further detail in the appendix. We identify parameter regions where the correlated 
states remarkably outperform the uncorrelated i.i.d. states regarding the speed of convergence. 
Also, distillation is possible for a larger range of initial fidelities. We are able to use the unwanted 
correlations from the memory to enhance the distillation, see also Fig. 3. 

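For readers who want to experiment with the Hamiltonian of Eq. (13), the following minimal sketch assembles it as a 4x4 matrix in plain Python. It assumes that $X$, $Y$, $Z$ denote the Pauli matrices and that the remaining symbol in Eq. (13) is the single-qubit identity; the coupling value and all helper names are hypothetical. The unitary $U(t, J)$ and the de-phasing channel are not modelled here.

```python
# Pauli matrices and the identity as 2x2 nested lists.
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
I2 = [[1, 0], [0, 1]]

def kron(P, Q):
    """Kronecker product of two 2x2 matrices as a 4x4 nested list."""
    return [[P[i][k] * Q[j][l] for k in range(2) for l in range(2)]
            for i in range(2) for j in range(2)]

def add(*mats):
    """Entry-wise sum of 4x4 matrices."""
    return [[sum(M[r][c] for M in mats) for c in range(4)] for r in range(4)]

def scale(s, M):
    """Multiply a 4x4 matrix by a scalar."""
    return [[s * M[r][c] for c in range(4)] for r in range(4)]

def hamiltonian(J):
    """H(J) = J (XX + YY + ZZ) + Z1 + 1Z, following Eq. (13)."""
    heisenberg = add(kron(X, X), kron(Y, Y), kron(Z, Z))
    return add(scale(J, heisenberg), kron(Z, I2), kron(I2, Z))

def apply(M, v):
    """Matrix-vector product for a 4x4 matrix."""
    return [sum(M[r][c] * v[c] for c in range(4)) for r in range(4)]

J = 0.3                        # illustrative coupling strength
H = hamiltonian(J)
singlet = [0, 1, -1, 0]        # unnormalised |01> - |10>
H_singlet = apply(H, singlet)  # equals -3 * J * singlet
```

The singlet is an eigenvector with eigenvalue $-3J$, since the exchange term acts as $-3$ on it and the two local $Z$ terms cancel.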

Perspectives 

In this work, we have introduced a framework of renormalising entanglement in order to 
achieve entanglement distillation in the presence of natural correlations. We have proven that 
protocols known to work for i.i.d. pairs above a threshold fidelity also give rise to feasible 
entanglement distillation. We have identified criteria to ensure convergence of correlated pairs described 
by an MPO to a number of independent maximally entangled pure states. On intuitive grounds, 
one might expect that if the MPO is only weakly correlated between the pairs and the reduced 
density matrix of a single pair is sufficiently close to a Bell pair, the distillation protocols should 
behave similarly to the i.i.d. case. Indeed, convergence can be proven for threshold fidelities and 
conditions on the correlation between the pairs. The programme initiated here shows that 
correlations are not necessarily a disadvantage, and one does not have to aim at de-correlating pairs 
or resetting preparation procedures, steps that will take time and will in practice lead to further 
entanglement deterioration. This work shows that such correlations can be largely renormalised 
away, with no modification to the schemes applied. We hope that this work triggers further studies 
on entanglement distillation and repeater protocols in the presence of realistic memory effects, as 
well as of further studies of renormalising matrix-product operators. 
I. 5-QUBIT PROTOCOL 


We now prove a threshold for successful purification for the 5-qubit error correcting code used 
as an entanglement distillation scheme. Instead of giving a lower bound on the fidelity, we give an 
upper bound on the infidelity, or the probability of measuring $\phi^-$, $\psi^+$ or $\psi^-$ for a pair of qubits. 

Proof. For $\phi^-$ we have 

$$ p(\phi^-) = \mathrm{Tr}\left[\rho\,(\phi^- \otimes \mathbb{1})\right] = \frac{\mathrm{Tr}\left[B E^{L-1}\right]}{\mathrm{Tr}\left[E^{L}\right]}, \tag{14} $$

where the $\mathbb{1}$ acts on all other pairs of qubits. Similar expressions hold for $\psi^+$ and $\psi^-$ by replacing 
$B$ with $C$ or $D$, respectively. We will find an upper bound of this expression in terms of the 
channel norm of the associated observable transfer matrices $B$, $C$, and $D$. We assume locally 
purifiable MPOs, which allows us to make use of an isomorphism between the MPO matrices and 
completely positive maps. Specifically, we define a norm $\|M\|$ in terms of a channel $\mathcal{M}$, 
Choi-isomorphic to $M$, so that $\|M\| = \|\mathcal{M}\|_{1\to1}$, and we use the induced “1-to-1” Schatten norm. See 
Ref. [31] and App. III A for more on norms. We call $\mathcal{E}$ the channel Choi-isomorphic to the transfer 
matrix $E$. Our Lemma 12 of the appendices proves that, assuming $\mathcal{E}$ is trace-preserving, we have 

$$ \mathrm{Tr}\left[\rho\,(\phi^- \otimes \mathbb{1})\right] \le \|B\|_{1\to1}\, \frac{1 + d^{5/2}\,\tau(\mathcal{E})^{L-1}}{1 - d^{5/2}\,\tau(\mathcal{E})^{L}}. \tag{15} $$


Proving convergence of the initial MPO in the state $\rho$ to the maximally entangled state $\phi^+$ is 
achieved by showing that the noise matrices vanish exponentially faster than the $A_n$ matrices. 

One step of the protocol takes 5 pairs as input and returns one pair as output, correcting all 
zero and one qubit errors and thus reducing the error probability to at least quadratic order in $\epsilon_n$. 
As discussed before, the code applies without post-selection, which means that the state does not 
have to be renormalised. As the length of the chain $L_n$ is divided by five in every step, the new 
transfer matrix $E_{n+1}$ is simply the fifth power of the previous transfer matrix $E_n$. So if we start 
with a trace-preserving transfer channel, it remains trace-preserving. Thus, the renormalisation 
factor, derived in Lemma 11, is constant, 

$$ E_{n+1} = E_n^{5}, \qquad E_n = E_0^{5^n}, \tag{16} $$

and 

$$ L_{n+1} = \frac{L_n}{5}, \qquad L_n = \frac{L_0}{5^n}. \tag{17} $$

By sub-multiplicativity of the ergodicity, we have 

$$ \tau(\mathcal{E}_n)^{L_n} \le \tau(\mathcal{E}_0)^{5^n L_0 / 5^n} = \tau(\mathcal{E}_0)^{L_0}. \tag{18} $$


This already gives us an upper bound for the normalization for every step in the iteration just 
from the initial length of the chain and ergodicity. The second step is to upper bound $\|B_n\|_{1\to1}$, 
$\|C_n\|_{1\to1}$ and $\|D_n\|_{1\to1}$. We introduced, as a measure of the noise, the quantity 

$$ \epsilon_n = \max\left\{ \|B_n\|_{1\to1}, \|C_n\|_{1\to1}, \|D_n\|_{1\to1} \right\}, \tag{19} $$


defined in a gauge and scaling where $\mathcal{E}_n$ is trace-preserving. This gauge and scaling stays 
constant over the iteration. The complete update rules can be derived following the methodology of 
Sec. II A. We obtain a closed form for the map acting on the coefficient matrices in each iteration, 
though we omit it here as each expression contains $4^4$ terms and is not very insightful. Rather, we 
present just the leading terms here, 

$$ A_{n+1} = A_n^5 + A_n^4 B_n + A_n^4 C_n + A_n^4 D_n + \dots, \tag{20} $$

$$ B_{n+1} = A_n^3 B_n^2 + A_n^3 C_n D_n + A_n^3 D_n C_n + \dots, \tag{21} $$

$$ C_{n+1} = A_n^3 B_n C_n + A_n^3 C_n B_n + A_n^3 D_n^2 + \dots, \tag{22} $$

$$ D_{n+1} = A_n^3 B_n D_n + A_n^3 C_n^2 + A_n^3 D_n B_n + \dots \tag{23} $$


Furthermore, we do not need the explicit iteration rule but only the norm of it. Each term is a 
product consisting of $A_n$ and some noise matrices. We can upper bound the channel norm by 
using sub-multiplicativity and sub-additivity and the definition of our noise measure, 

$$ \max\left\{ \|B_{n+1}\|_{1\to1}, \|C_{n+1}\|_{1\to1}, \|D_{n+1}\|_{1\to1} \right\} \le 30\epsilon_n^2 \|A_n\|_{1\to1}^3 + 70\epsilon_n^3 \|A_n\|_{1\to1}^2 + 90\epsilon_n^4 \|A_n\|_{1\to1} + 66\epsilon_n^5. \tag{24} $$

The noise terms are at least of quadratic order. The norm of $A_n$ can be easily upper bounded using 
the properties of channels and channel norms, $\|\mathcal{A}_n\|_{1\to1} \le \|\mathcal{E}_n\|_{1\to1} = 1$. The ensuing iteration 
for the noise measure is thus 

$$ \epsilon_{n+1} \le 30\epsilon_n^2 + 70\epsilon_n^3 + 90\epsilon_n^4 + 66\epsilon_n^5. \tag{25} $$

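Clearly, for errors to reduce, $\epsilon_n$ must be at least smaller than $1/30$. In this regime, we have 

$$ \epsilon_{n+1} \le 30\epsilon_n^2 + \frac{7}{3}\epsilon_n^2 + \frac{1}{10}\epsilon_n^2 + \frac{11}{4500}\epsilon_n^2 = \left(30 + \frac{7}{3} + \frac{1}{10} + \frac{11}{4500}\right)\epsilon_n^2 < 33\,\epsilon_n^2. \tag{26} $$

So we can be sure the iteration converges if $\epsilon_0 < 1/33 \approx 0.0303$. Numerically, we find that 
it converges for $\epsilon_0 < 0.031$. The speed of the convergence is double exponential in the number of rounds, 

$$ \epsilon_n \le \frac{1}{33}\,(33\epsilon_0)^{2^n}. \tag{27} $$

Making use of Eq. (15) and Eq. (18), we find that the infidelity is bounded by 

$$ p(\phi^-) \le \frac{1}{33}\,(33\epsilon_0)^{2^n}\, \frac{1 + d^{5/2}\,\tau(\mathcal{E}_0)^{L_0-1}}{1 - d^{5/2}\,\tau(\mathcal{E}_0)^{L_0}}. \tag{28} $$

Since $|\tau(\mathcal{E}_0)| < 1$, we have in generic cases that in the limit of an infinite chain 

$$ p(\phi^-) \le \frac{1}{33}\,(33\epsilon_0)^{2^n}, \tag{29} $$

and it converges whenever $\|B_0\|_{1\to1}, \|C_0\|_{1\to1}, \|D_0\|_{1\to1} < 1/33$, which ends the proof. □ 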


It has hence been shown that we can formulate a threshold depending on the 1-to-1-norm of 
the three noise channels. This concludes our proof of a threshold for the five-qubit error correcting 
code. 
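The arithmetic behind Eqs. (25) to (27) can be checked with a few lines of Python. The sketch below takes the coefficients 30, 70, 90 and 66 from Eq. (25), confirms that they add up to the $4^4$ terms mentioned above, evaluates the constant of Eq. (26) exactly, and locates the largest noise level for which the bound of Eq. (25) still contracts. The bisection routine and all names are illustrative choices, not part of the proof.

```python
from fractions import Fraction

# Coefficients of Eq. (25): power of eps_n -> prefactor.
COEFFS = {2: 30, 3: 70, 4: 90, 5: 66}

def noise_update(eps):
    """Right-hand side of Eq. (25)."""
    return sum(c * eps**p for p, c in COEFFS.items())

# Each noise expression collects 4^4 products in total.
n_terms = sum(COEFFS.values())

# Eq. (26): for eps < 1/30 one has eps**p <= eps**2 / 30**(p - 2).
constant = sum(Fraction(c, 30 ** (p - 2)) for p, c in COEFFS.items())

def contraction_threshold(tol=1e-12):
    """Largest eps with noise_update(eps) < eps, located by bisection."""
    lo, hi = 0.0, 0.1
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if noise_update(mid) < mid:
            lo = mid
        else:
            hi = mid
    return lo

threshold = contraction_threshold()
```

The exact constant is $145961/4500 \approx 32.44$, which is what permits rounding up to 33, and the bisection lands very close to the numerical value 0.031 quoted in the text.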
II. RECURRENCE PROTOCOL 


We now turn to the post-selected recurrence protocol. It turns out that the proof is significantly 
more sophisticated since we can not rely on the trace-preserving property of the transfer channel. 
This requires a refined approach that involves deriving a perturbation bound for the left Perron 
vector of a quantum channel, using ideas of Markov chain mixing. 


A. Computational vs. Bell basis 

The recurrence protocol can be broken up into two steps performed locally by both Alice and 
Bob. Alice (and also Bob) divides her qubits up into adjacent pairs within the MPO chain. 
For each pair the isometry $K = |0\rangle\langle 0,0| + |1\rangle\langle 1,1|$ is applied. This can be implemented using a 
local CNOT, measurement, post-selection and disposing of the measured qubit. The second step 
consists of a Hadamard rotation, again by both Alice and Bob. It is easy to verify that after the 
first step we again get an MPO with the same bond dimension (see Fig. 1) such that 

$$ X^{x,y} \mapsto \left(X^{x,y}\right)^2, \tag{30} $$

where $X$ is the MPO operator in elements of the computational basis of 2 qubits, and so 
$x, y \in \{(0,0), (1,0), (0,1), (1,1)\}$. Indeed, this is directly analogous to the i.i.d. case where the 
density matrix elements map as $\rho_{x,y} \mapsto \rho_{x,y}^2$, though of course only in the computational basis. 
The phase noise is dealt with by the second step. The bilateral Hadamard operation effectively 
swaps bit and phase flip noise so that both are dealt with. In the computational basis the bilateral 
Hadamard operation is unwieldy, so we switch to the Bell basis (see Fig. 4). For the Bell basis 
we use the following shorthand 

$$ A = M^{1,1} = X^{(0,0),(0,0)} + X^{(1,1),(0,0)} + X^{(0,0),(1,1)} + X^{(1,1),(1,1)}, \tag{31} $$

$$ B = M^{2,2} = X^{(0,0),(0,0)} - X^{(1,1),(0,0)} - X^{(0,0),(1,1)} + X^{(1,1),(1,1)}, \tag{32} $$

$$ C = M^{3,3} = X^{(0,1),(0,1)} + X^{(0,1),(1,0)} + X^{(1,0),(0,1)} + X^{(1,0),(1,0)}, \tag{33} $$

$$ D = M^{4,4} = X^{(0,1),(0,1)} - X^{(0,1),(1,0)} - X^{(1,0),(0,1)} + X^{(1,0),(1,0)}, \tag{34} $$

which only defines the MPO operators for the diagonal elements in the Bell basis, but we will see 
that they are decoupled from other elements and so it is sufficient to consider these alone. 

FIG. 4. A bi-partite matrix product operator (MPO), expressed in the Bell basis. 

In this basis the recurrence procedure implements 


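$$ A_n \mapsto A_n^2 + B_n^2 \mapsto A_n^2 + B_n^2 = A_{n+1}, \tag{35} $$

$$ B_n \mapsto \{A_n, B_n\} \mapsto C_n^2 + D_n^2 = B_{n+1}, \tag{36} $$

$$ C_n \mapsto C_n^2 + D_n^2 \mapsto \{A_n, B_n\} = C_{n+1}, \tag{37} $$

$$ D_n \mapsto \{C_n, D_n\} \mapsto \{C_n, D_n\} = D_{n+1}, \tag{38} $$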

where each of the two steps is shown, and a subscript $n$ is introduced to denote the MPO after 
$n$ iterations. The brackets $\{\cdot, \cdot\}$ denote the anti-commutator. For matrices that are simply scalars, 
where $B_n, C_n, D_n \sim \epsilon_n$ and $A_n \sim 1$, we see $B_{n+1}, D_{n+1} \sim 2\epsilon^2$ but $C_{n+1} \sim O(\epsilon)$. This occurs 
because in a single round only one type of noise is decreased, and so to see an overall $\epsilon^2$ error 
reduction we must consider two rounds of iteration 

$$ A_{n+2} = (A_n^2 + B_n^2)^2 + (C_n^2 + D_n^2)^2, \tag{39} $$

$$ B_{n+2} = \{A_n, B_n\}^2 + \{C_n, D_n\}^2, \tag{40} $$

$$ C_{n+2} = \{A_n^2 + B_n^2,\; C_n^2 + D_n^2\}, \tag{41} $$

$$ D_{n+2} = \{\{A_n, B_n\}, \{C_n, D_n\}\}. \tag{42} $$

Now treating the matrices as scalars, we see $B$, $C$, $D$ all go from size $O(\epsilon)$ to $O(\epsilon^2)$ or smaller. 
This is the intuition from the i.i.d. case and we next turn to making this rigorous by quantifying 
this size with appropriate matrix norms. 

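The scalar intuition can be reproduced numerically. The sketch below treats $A$, $B$, $C$, $D$ as commuting scalars, so that every anti-commutator $\{x, y\}$ becomes $2xy$, and compares two applications of the single-round map of Eqs. (35) to (38) with the closed two-round form of Eqs. (39) to (42). The initial weights and the helper names are hypothetical, and the weights are normalised by hand to read off a fidelity.

```python
def one_round(a, b, c, d):
    """One recurrence round for commuting scalars, Eqs. (35)-(38)."""
    return (a * a + b * b, c * c + d * d, 2 * a * b, 2 * c * d)

def two_rounds(a, b, c, d):
    """Closed form of Eqs. (39)-(42) with {x, y} = 2xy for scalars."""
    return ((a * a + b * b) ** 2 + (c * c + d * d) ** 2,
            (2 * a * b) ** 2 + (2 * c * d) ** 2,
            2 * (a * a + b * b) * (c * c + d * d),
            2 * (2 * a * b) * (2 * c * d))

def fidelity(weights):
    """Weight of the dominant component after normalisation."""
    return weights[0] / sum(weights)

# Illustrative i.i.d. input: a Werner-like pair with fidelity 0.7.
start = (0.7, 0.1, 0.1, 0.1)
composed = one_round(*one_round(*start))
closed_form = two_rounds(*start)
f_before = fidelity(start)
f_after = fidelity(closed_form)
```

For this input the fidelity rises from 0.7 to roughly 0.846 after two rounds, and the composed and closed-form results coincide up to rounding.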

B. Detailed outline of convergence proof for recurrence protocol 


In the main text we presented a sketch of the proof of Theorem 1. We first recap this sketch, 
filling in some details and clearly stating the required lemmas. Numerous technical tools relating 
to norms and the ergodicity coefficient are covered in Sec. III A. The first step is showing the 
iterative formulae relating MPO operators after $n + 1$ distillation rounds as a polynomial of MPO 
operators after $n$ rounds, as introduced in Eq. (42). We introduce a prime in the outcomes of 
Eq. (42), $A'_{n+2}$, $B'_{n+2}$, $C'_{n+2}$, $D'_{n+2}$, and we introduce the noise measure 

$$ \epsilon'_n = \max\left( \|B'_n\|_{1\to1}, \|C'_n\|_{1\to1}, \|D'_n\|_{1\to1} \right). \tag{43} $$


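In a second step, since the channel $A'_{n+2}$ is not trace-preserving, we make use of the following
lemma.


Lemma 1 (Conjugation of quantum channels). Let $\mathcal{T}$ be a completely positive map with largest
eigenvalue $\lambda$ and $\mathcal{T}^\dagger(\xi) = \lambda\xi$, $\xi = \xi^\dagger$. Let $\mathcal{S}$ be a channel mapping $\mathcal{S}(\rho) = \xi^{1/2}\rho\,\xi^{1/2}$. If $\xi$ is
invertible, $\mathcal{S}^{-1}$ exists and $\lambda^{-1}\,\mathcal{S}\circ\mathcal{T}\circ\mathcal{S}^{-1}$ is trace-preserving.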


We give further steps of the proof in Sec. III C. Applying the lemma to $A'_{n+2}$ gives a gauge
transform that we herein label $\mathcal{S}_n$. More specifically, we use

$$ A_{n+2} = \lambda_{n+2}^{-1}\,\mathcal{S}_{n+2} A'_{n+2} \mathcal{S}_{n+2}^{-1}, \tag{44} $$

$$ B_{n+2} = \lambda_{n+2}^{-1}\,\mathcal{S}_{n+2} B'_{n+2} \mathcal{S}_{n+2}^{-1}, \tag{45} $$

$$ C_{n+2} = \lambda_{n+2}^{-1}\,\mathcal{S}_{n+2} C'_{n+2} \mathcal{S}_{n+2}^{-1}, \tag{46} $$

$$ D_{n+2} = \lambda_{n+2}^{-1}\,\mathcal{S}_{n+2} D'_{n+2} \mathcal{S}_{n+2}^{-1}. \tag{47} $$

The matrix norms are not gauge invariant and much of the proof centres around obtaining an upper
bound on

$$ \epsilon_n = \max\left( \|B_n\|_{1\to 1},\; \|C_n\|_{1\to 1},\; \|D_n\|_{1\to 1} \right) \tag{48} $$

in this gauge where $A_n$ is trace-preserving. From sub-multiplicativity, we know that
$\epsilon_{n+2} \le \kappa(\mathcal{S}_{n+2})\,\lambda_{n+2}^{-1}\,\epsilon'_{n+2}$ and, as argued in the main text, we obtain a recursive relation

$$ \epsilon_{n+2} \le 4\,\kappa(\mathcal{S}_n)\,(1 + \epsilon_n^2)\,\epsilon_n^2, \tag{49} $$

where

$$ \kappa(\mathcal{S}_n) = \|\mathcal{S}_n\|_{1\to 1}\,\|\mathcal{S}_n^{-1}\|_{1\to 1} \tag{50} $$

is the condition number of $\mathcal{S}_n$ and where we omitted $\lambda$, as $\lambda \ge 1$. A substantial amount of our
technical work goes into proving the following lemma.


Lemma 2 (Condition number). The condition number $\kappa(\mathcal{S}_n)$, where $\mathcal{S}_n$ is our gauge change,
is upper bounded by

$$ \kappa(\mathcal{S}_n) \le (1 - 2k_n)^{-1} \quad\text{with}\quad k_n = \frac{1 + \tau_n^4}{1 - \tau_n^4}\left(4\epsilon_n^2 + 10\epsilon_n^4\right). \tag{51} $$

This we prove in Sec. III D, introducing Theorem 6, an eigenvector perturbation theorem, which
we prove in Sec. III E. Notice that the bound on the condition number depends on both $\tau_n$ and $\epsilon_n$,
and in turn $\epsilon_{n+2}$ is now upper bounded by a function of only $\epsilon_n$ and $\tau_n$. We already understand
the iterative behaviour of $\epsilon_n$, but not of $\tau_n$. In Sec. III F we show the following.

Lemma 3 (Upper bound to ergodicity coefficient). The ergodicity coefficient $\tau_n$ obeys

$$ \tau_{n+2} \le \tau_n^4\,(1 + 3\Delta_n) + \frac{1 + \Delta_n}{1 - \Delta_n}\left(3\Delta_n + 2\epsilon_n^2 + 5\epsilon_n^4\right), \tag{52} $$

where

$$ \Delta_n = k_n/(1 - k_n) \tag{53} $$

with $k_n$ as in Lemma 2.

Notice that $\tau_{n+2}$ depends only on $\tau_n$ and $\epsilon_n$. Therefore, these two lemmas provide a pair of
coupled inequalities that bound $(\epsilon_{n+2}, \tau_{n+2})$ as a function of $(\epsilon_n, \tau_n)$. It is then straightforward
to study numerically which initial conditions $(\epsilon_0, \tau_0)$ flow towards the desired point $(0, 0)$, and
this is presented in Fig. 3. These results show that a state within this distillable region is guaranteed
to converge, but many steps in our argument make worst-case assumptions and so we may
also find convergence for many actual MPO states with $(\epsilon_0, \tau_0)$ outside this region.
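A minimal sketch of such a numerical study is given below. It iterates the coupled upper bounds of Lemmas 2 and 3, using $\|\mathcal{P}_n\|_{1\to 1} \le 2\epsilon_n^2 + 5\epsilon_n^4$ and $k_n = 2 Z_n \|\mathcal{P}_n\|_{1\to 1}$ with $Z_n = (1+\tau_n^4)/(1-\tau_n^4)$. The function names, the stopping tolerance and the number of steps are our own illustrative choices, and the code tracks the bounds only, not an actual MPO.

```python
def step(eps, tau):
    """One two-round update of the upper bounds on (eps_n, tau_n).

    Returns None when the bounds stop being meaningful (tau >= 1 or k_n >= 1/2).
    """
    if not (0.0 <= tau < 1.0):
        return None
    P = 2 * eps**2 + 5 * eps**4          # bound on the norm of the perturbation
    Z = (1 + tau**4) / (1 - tau**4)
    k = 2 * Z * P                        # k_n of Lemma 2
    if k >= 0.5:                         # (1 - 2 k_n)^{-1} would not be a valid bound
        return None
    delta = k / (1 - k)                  # Delta_n of Lemma 3
    kappa = (1 + delta) / (1 - delta)    # equals 1 / (1 - 2 k_n)
    eps_next = 4 * kappa * (eps**2 + eps**4)
    tau_next = tau**4 * (1 + 3 * delta) + kappa * (3 * delta + P)
    return eps_next, tau_next


def flows_to_origin(eps0, tau0, steps=20, tol=1e-9):
    """True if the bounds started at (eps0, tau0) reach the tolerance around (0, 0)."""
    eps, tau = eps0, tau0
    for _ in range(steps):
        nxt = step(eps, tau)
        if nxt is None:
            return False
        eps, tau = nxt
        if eps < tol and tau < tol:
            return True
    return False


print(flows_to_origin(0.05, 0.5))
print(flows_to_origin(0.4, 0.5))
```

Scanning `flows_to_origin` over a grid of $(\epsilon_0, \tau_0)$ gives a picture of the region in which the bounds guarantee convergence; a `False` result only means that these worst-case bounds are inconclusive.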

The precise shape of the distillable region is difficult to characterise analytically. However, we
can analytically prove convergence on a slightly smaller region.


Lemma 4 (Proof of convergence). Given the iterative formulae for upper bounds on $\epsilon_n$ and $\tau_n$,
we know that $(\epsilon_n, \tau_n) \to (0, 0)$ whenever

$$ \epsilon_0 \le \frac{1}{7}\,\frac{1 - \tau_0^4}{1 + \tau_0^4}. \tag{54} $$

We give a full proof in Sec. III G. We have shown that the MPO noise matrices vanish, while
$A_n$ remains trace-preserving and so of constant norm. As $\epsilon_n \to 0$ we have $A_n \to \mathcal{E}_n$ and we
can conclude convergence in fidelity as follows from Lemma 12.
III. MATHEMATICAL CONCEPTS AND PROOFS

In this section, we present the detailed proofs of the statements on the recurrence protocol of
the main text. Since this requires some preparation, we will first define and introduce all the tools
we make use of.


A. Definitions and properties of norms 

1. Properties of norms 


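As norms for the states we use the Schatten $p$-norm,

$$ \|\rho\|_p = \Big(\sum_i \sigma_i(\rho)^p\Big)^{1/p}, \tag{55} $$

where $\sigma_i(\rho)$ is the $i$th singular value of $\rho$. For the maps we will use norms induced by
Schatten norms,

$$ \|\mathcal{T}\|_{p\to q} = \max_{\sigma}\frac{\|\mathcal{T}(\sigma)\|_q}{\|\sigma\|_p}. \tag{56} $$

We can now use that

$$ \|\sigma_1\|_p = \max_{\sigma_2}\frac{\mathrm{Tr}[\sigma_2^\dagger\sigma_1]}{\|\sigma_2\|_q} \quad\text{with}\quad \frac{1}{p} + \frac{1}{q} = 1 \;\Leftrightarrow\; q = \frac{p}{p-1}, \tag{57} $$

and deduce the following lemma.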


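Lemma 5 (Norm of the dual channel). Let $\mathcal{T}$ be a map. Then, for all $r, s \ge 1$,

$$ \|\mathcal{T}\|_{r\to s} = \|\mathcal{T}^\dagger\|_{\frac{s}{s-1}\to\frac{r}{r-1}}. \tag{58} $$

Here we use $\mathcal{T}^\dagger$ to denote the dual channel, i.e., the unique channel such that for all $A$, $B$ we
have

$$ \mathrm{tr}(A\,\mathcal{T}(B)) = \mathrm{tr}(\mathcal{T}^\dagger(A)\,B). $$

Proof. This statement follows immediately from the definition of the norms. We have

$$ \|\mathcal{T}\|_{r\to s} = \max_{\sigma_1}\frac{\|\mathcal{T}(\sigma_1)\|_s}{\|\sigma_1\|_r} = \max_{\sigma_1}\max_{\sigma_2}\frac{\mathrm{Tr}[\sigma_2^\dagger\,\mathcal{T}(\sigma_1)]}{\|\sigma_2\|_{\frac{s}{s-1}}\,\|\sigma_1\|_r} \tag{59} $$

$$ = \max_{\sigma_1}\max_{\sigma_2}\frac{\mathrm{Tr}[\sigma_1^\dagger\,\mathcal{T}^\dagger(\sigma_2)]}{\|\sigma_2\|_{\frac{s}{s-1}}\,\|\sigma_1\|_r} = \max_{\sigma_2}\frac{\|\mathcal{T}^\dagger(\sigma_2)\|_{\frac{r}{r-1}}}{\|\sigma_2\|_{\frac{s}{s-1}}}, \tag{60} $$

which is the statement to be proven. □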
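Using Lemma 5 we state that $\|\mathcal{T}\|_{1\to 1} = \|\mathcal{T}^\dagger\|_{\infty\to\infty}$.

Lemma 6 (Restriction to positive operators). Let $\mathcal{T}$ be a positive map, then

$$ \max_{\sigma}\frac{\|\mathcal{T}(\sigma)\|_1}{\|\sigma\|_1} = \max_{\sigma\ge 0}\frac{\|\mathcal{T}(\sigma)\|_1}{\|\sigma\|_1}. \tag{61} $$

Proof. Assume that the maximum for $\mathcal{T}$ is reached by $\sigma_{\max} = \sigma^+ - \sigma^-$ where $\sigma^+, \sigma^- \ge 0$ and
$\mathrm{tr}((\sigma^+)^\dagger\sigma^-) = 0$. Though this matrix is potentially non-positive ($\sigma_{\max} \not\ge 0$), we show that there
always exists a non-negative matrix $\sigma'_{\max}$ that also achieves the maximum. A completely positive
map $\mathcal{T}$ preserves positivity, and so

$$ \|\mathcal{T}(\sigma_{\max})\|_1 = \|\mathcal{T}(\sigma^+) - \mathcal{T}(\sigma^-)\|_1 \le \|\mathcal{T}(\sigma^+)\|_1 + \|\mathcal{T}(\sigma^-)\|_1, \tag{62} $$

where we use the triangle inequality. For a positive Hermitian matrix $\|M\|_1 = \mathrm{tr}(M)$, and
since $\sigma^\pm$ are positive and $\mathcal{T}$ preserves positivity (it is a completely positive map), we infer

$$ \|\mathcal{T}(\sigma_{\max})\|_1 \le \mathrm{Tr}[\mathcal{T}(\sigma^+)] + \mathrm{Tr}[\mathcal{T}(\sigma^-)] = \mathrm{Tr}[\mathcal{T}(\sigma^+) + \mathcal{T}(\sigma^-)] = \mathrm{Tr}[\mathcal{T}(\sigma^+ + \sigma^-)] = \|\mathcal{T}(\sigma^+ + \sigma^-)\|_1. \tag{63} $$

Additionally we have

$$ \|\sigma_{\max}\|_1 = \mathrm{Tr}\big[(\sigma_{\max}^\dagger\sigma_{\max})^{1/2}\big] = \|\sigma^+ + \sigma^-\|_1. \tag{64} $$

So we can conclude that

$$ \frac{\|\mathcal{T}(\sigma_{\max})\|_1}{\|\sigma_{\max}\|_1} \le \frac{\|\mathcal{T}(\sigma^+ + \sigma^-)\|_1}{\|\sigma^+ + \sigma^-\|_1}. \tag{65} $$

Therefore, there exists a positive semi-definite matrix $\sigma'_{\max} = \sigma^+ + \sigma^-$ that also achieves the maximum
value. □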


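Corollary 3 (Relationship between norms). If $\mathcal{T}$ is a positive map then

$$ \|\mathcal{T}\|_{1\to 1} = \|\mathcal{T}^\dagger(1)\|_\infty. \tag{66} $$

Proof. Since in the variational definition of the $1\to 1$-norm stated above the maximum is always
reached on a positive state, we can use the trace properties and the variational relations of the
Schatten norms,

$$ \max_{\sigma\ge 0}\frac{\|\mathcal{T}(\sigma)\|_1}{\|\sigma\|_1} = \max_{\sigma\ge 0}\frac{\mathrm{Tr}[\mathcal{T}(\sigma)]}{\|\sigma\|_1} = \max_{\sigma\ge 0}\frac{\mathrm{Tr}[1\,\mathcal{T}(\sigma)]}{\|\sigma\|_1} = \max_{\sigma\ge 0}\frac{\mathrm{Tr}[\mathcal{T}^\dagger(1)\,\sigma]}{\|\sigma\|_1} = \|\mathcal{T}^\dagger(1)\|_\infty. \tag{67} $$

□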
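Corollary 3 can be illustrated numerically for a completely positive map given by Kraus operators. The sketch below is an assumption-laden toy check, not part of the proof: the Kraus operators are random, the dual of the identity is $\sum_i K_i^\dagger K_i$, and we only compare the ratio $\|\mathcal{T}(\sigma)\|_1/\|\sigma\|_1$ on positive inputs with the largest eigenvalue of that operator.

```python
import numpy as np

rng = np.random.default_rng(3)
d = 3
# Two random Kraus operators define a completely positive (not trace-preserving) map.
kraus = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(2)]


def channel(sigma):
    """Apply the map sigma -> sum_i K_i sigma K_i^dagger."""
    return sum(k @ sigma @ k.conj().T for k in kraus)


def trace_norm(m):
    """Schatten 1-norm: sum of singular values."""
    return np.linalg.svd(m, compute_uv=False).sum()


# Dual map applied to the identity, and its operator norm (largest eigenvalue).
dual_of_identity = sum(k.conj().T @ k for k in kraus)
evals, evecs = np.linalg.eigh(dual_of_identity)
bound = evals[-1]

# The projector onto the top eigenvector attains the bound.
top = evecs[:, -1]
sigma_top = np.outer(top, top.conj())
ratio_top = trace_norm(channel(sigma_top)) / trace_norm(sigma_top)


def random_ratio():
    """Ratio of trace norms for a random positive semi-definite input."""
    g = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
    sigma = g @ g.conj().T
    return trace_norm(channel(sigma)) / trace_norm(sigma)


ratios = [random_ratio() for _ in range(200)]
print(bound, ratio_top, max(ratios))
```

Random positive inputs never exceed the bound, while the top eigenvector of $\mathcal{T}^\dagger(1)$ saturates it, in line with Eq. (67).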


From this corollary, two further corollaries follow.

Corollary 4 (Norms for quantum channels). If $\mathcal{T}$ is a positive and trace-preserving map then

$$ \|\mathcal{T}\|_{1\to 1} = 1. $$

Corollary 5 (Norms for conjugations). For a map of the form $\mathcal{S}(\rho) = \xi^{1/2}\rho\,\xi^{1/2}$ with $\xi^\dagger = \xi$, we
have that

$$ \|\mathcal{S}\|_{1\to 1} = \|\xi^{1/2}\,1\,\xi^{1/2}\|_\infty = \|\xi\|_\infty. \tag{68} $$
2. Bounds of gauge maps

The condition number is small provided $\xi$ is close to $1$. Specifically, following Cor. 5 we can
upper bound

$$ \|\mathcal{S}\|_{1\to 1} = \|\xi\|_\infty, \tag{69} $$

$$ \|\mathcal{S}^{-1}\|_{1\to 1} = \|\xi^{-1}\|_\infty. \tag{70} $$

Observe that if $\xi$ is close to the identity then $\xi - 1$ is small, and it is helpful to reformulate the
bounds on $\xi$ in terms of $\Delta := \|\xi - 1\|_\infty$. We find

$$ \|\xi\|_\infty = \|1 - (1 - \xi)\|_\infty \le 1 + \|1 - \xi\|_\infty = 1 + \Delta. \tag{71} $$

Similarly, we can deduce

$$ \|1\|_\infty = \|\xi^{-1} - \xi^{-1}(1 - \xi)\|_\infty, \tag{72} $$

$$ 1 \ge \|\xi^{-1}\|_\infty - \|\xi^{-1}\|_\infty\,\|1 - \xi\|_\infty, \tag{73} $$

$$ \|\xi^{-1}\|_\infty \le \frac{1}{1 - \Delta}, \tag{74} $$

as well as

$$ \|1 - \xi^{-1}\|_\infty = \|\xi^{-1}(\xi - 1)\|_\infty \le \|\xi^{-1}\|_\infty\,\|1 - \xi\|_\infty \le \frac{\Delta}{1 - \Delta}. \tag{75} $$

We can deduce similar bounds for $\mathcal{S}$. If we apply $(1 - \mathcal{S})$ on some state $\rho$, we get

$$ (1 - \mathcal{S})(\rho) = \rho - \xi^{1/2}\rho\,\xi^{1/2} = (1 - \xi^{1/2})\rho + \rho\,(1 - \xi^{1/2}) - (1 - \xi^{1/2})\rho\,(1 - \xi^{1/2}). \tag{76} $$

We can use the Hölder inequality and the fact that $\|1 - \xi^{1/2}\|_\infty \le \|1 - \xi\|_\infty < 1$ to find

$$ \|1 - \mathcal{S}\|_{1\to 1} \le 3\,\|1 - \xi\|_\infty = 3\Delta, \tag{77} $$

$$ \|1 - \mathcal{S}^{-1}\|_{1\to 1} \le \frac{3\Delta}{1 - \Delta}. \tag{78} $$

These bounds will be used later.
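The operator-norm bounds on $\xi$ are easy to sanity-check numerically. The following sketch assumes a random Hermitian $\xi = 1 + h$ with $\|h\|_\infty = \Delta = 0.3$ (our choice) and verifies the bounds on $\|\xi\|_\infty$, $\|\xi^{-1}\|_\infty$, $\|1 - \xi^{-1}\|_\infty$ and $\|1 - \xi^{1/2}\|_\infty$; it does not compute the induced $1\to 1$ norms of $\mathcal{S}$ itself.

```python
import numpy as np

rng = np.random.default_rng(2)
d = 4
delta = 0.3                                   # illustrative value of Delta < 1

# Random Hermitian perturbation rescaled to operator norm delta.
h = rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d))
h = (h + h.conj().T) / 2
h *= delta / np.linalg.norm(h, 2)
xi = np.eye(d) + h


def op_norm(m):
    """Operator (Schatten infinity) norm: largest singular value."""
    return np.linalg.norm(m, 2)


# Matrix square root of the positive matrix xi via its eigendecomposition.
w, v = np.linalg.eigh(xi)
xi_half = (v * np.sqrt(w)) @ v.conj().T
xi_inv = np.linalg.inv(xi)
one = np.eye(d)

checks = {
    "norm of xi": (op_norm(xi), 1 + delta),
    "norm of xi^-1": (op_norm(xi_inv), 1 / (1 - delta)),
    "norm of 1 - xi^-1": (op_norm(one - xi_inv), delta / (1 - delta)),
    "norm of 1 - xi^1/2": (op_norm(one - xi_half), delta),
}
for name, (value, bound) in checks.items():
    print(f"{name}: {value:.4f} <= {bound:.4f}")
```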


B. Ergodicity coefficient and fundamental channel 

We make frequent use of another concept originating from the theory of Markov chains, which
is the ergodicity coefficient. It is a measure for how close a quantum channel is to a projection
onto its steady state. The ergodicity coefficient is defined as follows: If $\rho$ is the steady state of a completely
positive map $\mathcal{T}$ and

$$ \|\sigma\|_1 = \mathrm{Tr}\big[(\sigma^\dagger\sigma)^{1/2}\big] \tag{79} $$

is the trace norm of $\sigma$, then

$$ \tau(\mathcal{T}) = \max_{\mathrm{Tr}[\sigma^\dagger\rho]=0}\frac{\|\mathcal{T}(\sigma)\|_1}{\|\sigma\|_1}. \tag{80} $$

The ergodicity coefficient is similar to the second eigenvalue of the map, and in fact it is
straightforward to see that it always upper bounds the second eigenvalue. The ergodicity coefficient is
sub-multiplicative. If two maps $\mathcal{T}_1$ and $\mathcal{T}_2$ are trace-preserving then

$$ \tau(\mathcal{T}_1\mathcal{T}_2) \le \tau(\mathcal{T}_1)\,\tau(\mathcal{T}_2). \tag{81} $$

So the product of the two maps is at least as close to being a projector as both maps individually.
Multiplying a large number of quantum channels, each with an ergodicity coefficient smaller than unity, then
eventually leads to a projection.
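The quantum ergodicity coefficient is not straightforward to evaluate, but its classical Markov-chain counterpart is. As an illustration only, the sketch below uses column-stochastic matrices, for which the maximum over zero-sum vectors reduces to half the largest $\ell_1$ distance between two columns (the Dobrushin coefficient). The classical setting and all names are our assumptions; the code merely illustrates sub-multiplicativity.

```python
import numpy as np


def dobrushin(P):
    """Classical ergodicity coefficient of a column-stochastic matrix P."""
    n = P.shape[1]
    return 0.5 * max(
        np.abs(P[:, i] - P[:, j]).sum() for i in range(n) for j in range(n)
    )


def random_stochastic(n, rng):
    """Random column-stochastic matrix with strictly positive entries."""
    M = rng.random((n, n)) + 0.05
    return M / M.sum(axis=0)


rng = np.random.default_rng(5)
P = random_stochastic(4, rng)
Q = random_stochastic(4, rng)

print(dobrushin(P @ Q), "<=", dobrushin(P) * dobrushin(Q))
print(dobrushin(np.linalg.matrix_power(P, 20)), "<=", dobrushin(P) ** 20)
```

Repeated multiplication drives the coefficient to zero, which is the classical version of the statement that a long product of channels approaches a projection onto the steady state.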


C. Preserving the trace 

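The observation that $A_i A_{i+1} = (A_i S)(S^{-1} A_{i+1})$ for any $S \in GL(d, \mathbb{C})$ means that an MPO is
not uniquely defined and offers a gauge freedom. Since we deal with unnormalised MPOs, we also
have the freedom of rescaling. There is a canonical gauge and scale corresponding to the transfer
channel $\mathcal{E}$ being trace-preserving. We apply the gauge transformation $A_i \mapsto A'_i = S A_i S^{-1}$ with
$S = \xi^{1/2} \otimes \xi^{1/2}$, where $\xi$ is the left Perron eigenstate of $\mathcal{E}$. Consequently,

$$ E \mapsto E' = \left(\xi^{1/2} \otimes \xi^{1/2}\right) E \left(\xi^{-1/2} \otimes \xi^{-1/2}\right) \tag{82} $$

and

$$ \mathcal{E}(\rho) \mapsto \mathcal{E}'(\rho) = \xi^{1/2}\,\mathcal{E}\big(\xi^{-1/2}\rho\,\xi^{-1/2}\big)\,\xi^{1/2}. \tag{83} $$

Now we are in a gauge where $1$ is the left Perron state,

$$ \mathcal{E}^\dagger(1) \mapsto \xi^{-1/2}\,\mathcal{E}^\dagger\big(\xi^{1/2}\,1\,\xi^{1/2}\big)\,\xi^{-1/2} = \xi^{-1/2}\,\mathcal{E}^\dagger(\xi)\,\xi^{-1/2} = \xi^{-1/2}\,\xi\,\xi^{-1/2} = 1. \tag{84} $$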
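The gauge fixing can be tried out on a small example. The sketch below assumes a completely positive map given by random Kraus operators $K_i$ (so $\mathcal{E}(\rho) = \sum_i K_i \rho K_i^\dagger$), finds the left Perron state $\xi$ by power iteration on the dual map, and conjugates with $\xi^{1/2}$; the Kraus representation, the iteration count and the rescaling by the Perron eigenvalue are our illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_kraus = 3, 4
kraus = [rng.normal(size=(d, d)) + 1j * rng.normal(size=(d, d)) for _ in range(n_kraus)]


def dual(x, ks):
    """Dual (Heisenberg-picture) map: x -> sum_i K_i^dagger x K_i."""
    return sum(k.conj().T @ x @ k for k in ks)


# Power iteration on the dual map, started from the identity so iterates stay positive.
xi = np.eye(d, dtype=complex)
for _ in range(2000):
    xi = dual(xi, kraus)
    xi = (xi + xi.conj().T) / 2          # remove numerical anti-Hermitian noise
    xi /= np.trace(xi).real
lam = np.trace(dual(xi, kraus)).real     # Perron eigenvalue, since Tr[xi] = 1

# xi^{1/2} and xi^{-1/2} from the eigendecomposition of the positive matrix xi.
w, v = np.linalg.eigh(xi)
xi_half = (v * np.sqrt(w)) @ v.conj().T
xi_inv_half = (v / np.sqrt(w)) @ v.conj().T

# Gauge-transformed and rescaled Kraus operators.
kraus_gauged = [xi_half @ k @ xi_inv_half / np.sqrt(lam) for k in kraus]

# In the new gauge the identity is the left Perron state: the map is trace-preserving.
print(np.round(dual(np.eye(d), kraus_gauged), 8))
```

The printed matrix is the identity up to numerical precision, which is Eq. (84) after rescaling by the largest eigenvalue.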


D. Condition number 

Here we prove the condition number bound presented as Lemma 2. We use Eqs. (71) and (74)
to derive a new expression for the condition number,

$$ \kappa(\mathcal{S}_n) = \|\xi_n\|_\infty\,\|\xi_n^{-1}\|_\infty \le \frac{1 + \Delta_n}{1 - \Delta_n}. \tag{85} $$

To proceed we need to upper bound $\Delta_n$, which we achieve using the following powerful result.

Theorem 6 (Eigenvector perturbation theorem). Let $\mathcal{T}_1$ and $\mathcal{T}_2$ be completely positive maps with
spectral radius 1, and let $\mathcal{T}_1$ be trace-preserving. If $\xi$ is the left Perron state for $\mathcal{T}_2$, so that $\mathcal{T}_2^\dagger(\xi) = \xi$,
then

$$ \|1 - \xi\|_\infty \le \frac{k}{1 - k}, \tag{86} $$

where

$$ k = \frac{1 + \tau(\mathcal{T}_1)}{1 - \tau(\mathcal{T}_1)}\,\|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1}. \tag{87} $$

We prove this result in the following subsections, but here make direct use of it. We set $\mathcal{T}_1 = A_n^4$,
which inherits the required properties from $A_n$. Likewise, we set

$$ \mathcal{T}_2 := \lambda^{-1}\left(A_n^4 + \mathcal{P}_n\right), \tag{88} $$

where $\mathcal{P}_n := A'_{n+2} - A_n^4 = \{A_n^2, B_n^2\} + B_n^4 + (C_n^2 + D_n^2)^2$ is a perturbation composed of noise
matrices and $\lambda$ is a normalisation constant ensuring that $\mathcal{T}_2$ has spectral radius 1. Note that $\mathcal{T}_2$
differs from $A'_{n+2}$ by only a constant so they have the same eigenvectors, e.g. $\xi$. Therefore, the
theorem tells us that $\Delta_n \le k_n/(1 - k_n)$ where

$$ k_n = \frac{1 + \tau(A_n^4)}{1 - \tau(A_n^4)}\,\|A_n^4 - \mathcal{T}_2\|_{1\to 1}. \tag{89} $$

Since $A_n$ is trace-preserving, sub-multiplicativity gives $\tau(A_n^4) \le \tau(A_n)^4 = \tau_n^4$. Looking at the second factor, we collect the
$A_n^4$ terms and use the triangle inequality,

$$ \|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1} = \|(1 - \lambda^{-1})A_n^4 - \lambda^{-1}\mathcal{P}_n\|_{1\to 1} \le |1 - \lambda^{-1}|\,\|A_n^4\|_{1\to 1} + \lambda^{-1}\,\|\mathcal{P}_n\|_{1\to 1}. \tag{90} $$

To proceed, we need information about $\lambda$, which is the spectral radius of $A_n^4 + \mathcal{P}_n$ and so
$\lambda \le 1 + \|\mathcal{P}_n\|_{1\to 1}$. Furthermore, because $A_n^4$ and $\mathcal{P}_n$ are both positive maps, we know that $\lambda$ must
exceed the spectral radius of $A_n^4$ and so $1 \le \lambda$. Therefore, $\lambda^{-1} \le 1$ and $|1 - \lambda^{-1}| \le \|\mathcal{P}_n\|_{1\to 1}$.
Combining these observations, we have

$$ \|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1} \le 2\,\|\mathcal{P}_n\|_{1\to 1}, \tag{91} $$

and so

$$ k_n \le 2\left(\frac{1 + \tau_n^4}{1 - \tau_n^4}\right)\|\mathcal{P}_n\|_{1\to 1}. \tag{92} $$

To upper bound $\|\mathcal{P}_n\|_{1\to 1}$, we have to refer back to the iterative formulae and use norm
sub-multiplicativity to show $\|\mathcal{P}_n\|_{1\to 1} \le 2\epsilon_n^2 + 5\epsilon_n^4$. Substituting this into $k_n$ we get

$$ k_n \le \frac{1 + \tau_n^4}{1 - \tau_n^4}\left(4\epsilon_n^2 + 10\epsilon_n^4\right), \tag{93} $$

which proves Lemma 2. However, the proof rests upon Theorem 6, which we turn to in the next
section.
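For completeness, the norm count behind the bound on the perturbation can be spelled out as follows. This is a sketch that assumes only sub-multiplicativity and the triangle inequality for the $1\to 1$ norm, together with $\|A_n\|_{1\to 1} = 1$ (Corollary 4, since $A_n$ is trace-preserving in this gauge) and $\|B_n\|_{1\to 1}, \|C_n\|_{1\to 1}, \|D_n\|_{1\to 1} \le \epsilon_n$.

```latex
\begin{align*}
\|\{A_n^2, B_n^2\}\|_{1\to 1} &\le 2\,\|A_n\|_{1\to 1}^2\,\|B_n\|_{1\to 1}^2 \le 2\epsilon_n^2, \\
\|B_n^4\|_{1\to 1} &\le \|B_n\|_{1\to 1}^4 \le \epsilon_n^4, \\
\|(C_n^2 + D_n^2)^2\|_{1\to 1} &\le \left(\|C_n\|_{1\to 1}^2 + \|D_n\|_{1\to 1}^2\right)^2 \le 4\epsilon_n^4, \\
\|\mathcal{P}_n\|_{1\to 1} &\le 2\epsilon_n^2 + \epsilon_n^4 + 4\epsilon_n^4 = 2\epsilon_n^2 + 5\epsilon_n^4 .
\end{align*}
```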


E. Proof of the eigenvector perturbation theorem 

Our methodology for proving Theorem 6 is in the spirit of Ref. [33], but significantly
generalised so that one of the channels need not be trace-preserving. The proof requires some new
concepts we have not yet introduced, including the fundamental channel.

Definition 7 (Fundamental channel). Let $\mathcal{T}$ be a channel, then the fundamental channel of $\mathcal{T}$ is

$$ \mathcal{Z} = \left(1 - \mathcal{T} + \mathcal{T}^\infty\right)^{-1}. \tag{94} $$

This definition is central to the following two lemmas.

Lemma 8 (Bound for fundamental channels). Let $\mathcal{T}_1$ and $\mathcal{T}_2$ be completely positive maps with
spectral radius 1, with $\mathcal{T}_1$ trace-preserving. Let $\mathcal{Z}_1$ be the fundamental channel of $\mathcal{T}_1$ and let the left Perron
state for $\mathcal{T}_2$ be $\xi$. Then

$$ \|1 - \xi\|_\infty \le \frac{\|\mathcal{Z}_1\|_{1\to 1}\,\|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1}}{1 - \|\mathcal{Z}_1\|_{1\to 1}\,\|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1}}. \tag{95} $$

The following also holds.

Lemma 9 (Second bound for fundamental channel). Let $\mathcal{T}$ be a CPT-map, and denote by $\mathcal{Z}$ the
fundamental channel of $\mathcal{T}$. It follows that

$$ \|\mathcal{Z}\|_{1\to 1} \le \frac{1 + \tau(\mathcal{T})}{1 - \tau(\mathcal{T})}. \tag{96} $$

Combining these results straightforwardly leads to Theorem 6, and so the remainder of this
section will prove these lemmas.


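Proof. We begin by bounding how much the left Perron state is perturbed from the identity. First
we look at the projectors of our maps in the matrix picture,

$$ \mathcal{T}_1^\infty = |\rho_1\rangle\langle 1| \quad\text{with}\quad \langle 1|\rho_1\rangle = \mathrm{Tr}[\rho_1] = 1, \tag{97} $$

$$ \mathcal{T}_2^\infty = |\rho_2\rangle\langle \xi| \quad\text{with}\quad \langle \xi|\rho_2\rangle = \mathrm{Tr}[\xi^\dagger\rho_2] = 1. \tag{98} $$

Since the eigenvector matters only up to a constant we can rescale $|\xi\rangle$ and $|\rho_2\rangle$ so they still satisfy
$\mathrm{Tr}[\xi^\dagger\rho_2] = 1$, but also satisfy $\langle\rho_1|\xi\rangle = \mathrm{Tr}[\rho_1^\dagger\xi] = 1$.

Since we are dealing with the left eigenvectors we have to transpose our maps for easier
notation. Thus, the transposed projectors are

$$ (\mathcal{T}_1^\infty)^\dagger = |1\rangle\langle\rho_1| \quad\text{with}\quad \langle 1|\rho_1\rangle = 1, \tag{99} $$

$$ (\mathcal{T}_2^\infty)^\dagger = |\xi\rangle\langle\rho_2| \quad\text{with}\quad \langle \xi|\rho_2\rangle = 1. \tag{100} $$

We start by applying $(\mathcal{Z}_1^\dagger)^{-1}$, the inverse fundamental matrix of the transposed $\mathcal{T}_1$, to the difference
between $\xi$ and the identity matrix $1$. This gives

$$ \begin{aligned} (\mathcal{Z}_1^\dagger)^{-1}(1 - \xi) &= \left(1 - \mathcal{T}_1^\dagger + (\mathcal{T}_1^\infty)^\dagger\right)(1 - \xi) \\ &= 1 - 1 + 1 - \xi + \mathcal{T}_1^\dagger(\xi) - 1\,\mathrm{Tr}[\rho_1^\dagger\xi] \\ &= -\xi + \mathcal{T}_1^\dagger(\xi). \end{aligned} \tag{101} $$

Going from the first to the second line, we have used $\mathcal{T}_1^\dagger(1) = 1$ and $(\mathcal{T}_1^\infty)^\dagger(1) = 1$. Going from the
second to the third we use the normalisation condition $\mathrm{Tr}[\rho_1^\dagger\xi] = 1$ and cancel the identities. It is a
condition of the lemma that $\xi = \mathcal{T}_2^\dagger(\xi)$ and so

$$ (\mathcal{Z}_1^\dagger)^{-1}(1 - \xi) = \left(\mathcal{T}_1^\dagger - \mathcal{T}_2^\dagger\right)(\xi). \tag{102} $$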
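Now we multiply by $\mathcal{Z}_1^\dagger$ on both sides and take the $\infty$-norm,

$$ \begin{aligned} \|1 - \xi\|_\infty &= \|\mathcal{Z}_1^\dagger\circ(\mathcal{T}_1^\dagger - \mathcal{T}_2^\dagger)(\xi)\|_\infty \\ &\le \|\mathcal{Z}_1^\dagger\circ(\mathcal{T}_1^\dagger - \mathcal{T}_2^\dagger)\|_{\infty\to\infty}\,\|\xi\|_\infty \\ &\le \|\mathcal{Z}_1^\dagger\circ(\mathcal{T}_1^\dagger - \mathcal{T}_2^\dagger)\|_{\infty\to\infty}\left(\|1\|_\infty + \|1 - \xi\|_\infty\right). \end{aligned} \tag{103} $$

We now make use of Lemma 5 to deal with the adjoints and use that the identity has unit $\infty$-norm,
to get

$$ \|1 - \xi\|_\infty \le \|(\mathcal{T}_1 - \mathcal{T}_2)\circ\mathcal{Z}_1\|_{1\to 1}\left(1 + \|1 - \xi\|_\infty\right). \tag{104} $$

Now we subtract the term $\|(\mathcal{T}_1 - \mathcal{T}_2)\circ\mathcal{Z}_1\|_{1\to 1}\,\|1 - \xi\|_\infty$ from both sides, to arrive at

$$ \|1 - \xi\|_\infty\left(1 - \|(\mathcal{T}_1 - \mathcal{T}_2)\circ\mathcal{Z}_1\|_{1\to 1}\right) \le \|(\mathcal{T}_1 - \mathcal{T}_2)\circ\mathcal{Z}_1\|_{1\to 1}. \tag{105} $$

Assuming that $1 - \|(\mathcal{T}_1 - \mathcal{T}_2)\circ\mathcal{Z}_1\|_{1\to 1} > 0$, which is true for $\mathcal{T}_1$ and $\mathcal{T}_2$ being close enough,
we divide by $1 - \|(\mathcal{T}_1 - \mathcal{T}_2)\circ\mathcal{Z}_1\|_{1\to 1}$ and use sub-multiplicativity, to get

$$ \|1 - \xi\|_\infty \le \frac{\|\mathcal{Z}_1\|_{1\to 1}\,\|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1}}{1 - \|\mathcal{Z}_1\|_{1\to 1}\,\|\mathcal{T}_1 - \mathcal{T}_2\|_{1\to 1}}. \tag{106} $$

This completes the proof of Lemma 8. □

Next we turn our attention to proving Lemma 9.

Proof. First we reformulate the definition of the fundamental channel,

$$ \begin{aligned} \mathcal{Z}(\mathcal{T}) = \left(1 - \mathcal{T} + \mathcal{T}^\infty\right)^{-1} &= \sum_{k=0}^{\infty}\left(\mathcal{T} - \mathcal{T}^\infty\right)^k \\ &= 1 + \sum_{k=1}^{\infty}\left(\mathcal{T}^k - \mathcal{T}^\infty\right) \\ &= 1 + \sum_{k=0}^{\infty}\left(\mathcal{T} - \mathcal{T}^\infty\right)^k\left(\mathcal{T} - \mathcal{T}^\infty\right) \\ &= 1 + \sum_{k=0}^{\infty}\mathcal{T}^k\left(\mathcal{T} - \mathcal{T}^\infty\right). \end{aligned} \tag{107} $$

In those steps we used that $(\mathcal{T} - \mathcal{T}^\infty)^k = \mathcal{T}^k - \mathcal{T}^\infty$ and that $\mathcal{T}^\infty(\mathcal{T} - \mathcal{T}^\infty) = 0$. It is important
to note that for any $\sigma$, the expression $(\mathcal{T} - \mathcal{T}^\infty)(\sigma)$ is traceless. This gives

$$ \begin{aligned} \|\mathcal{Z}\|_{1\to 1} &= \max_{\sigma_1}\frac{\left\|\left(1 + \sum_{k=0}^{\infty}\mathcal{T}^k\circ(\mathcal{T} - \mathcal{T}^\infty)\right)(\sigma_1)\right\|_1}{\|\sigma_1\|_1} \\ &\le \max_{\sigma_1}\left(\frac{\|\sigma_1\|_1}{\|\sigma_1\|_1} + \frac{\left\|\sum_{k=0}^{\infty}\mathcal{T}^k\big((\mathcal{T} - \mathcal{T}^\infty)(\sigma_1)\big)\right\|_1}{\|\sigma_1\|_1}\right) \\ &\le 1 + \max_{\sigma_1}\frac{\left\|\sum_{k=0}^{\infty}\mathcal{T}^k\big((\mathcal{T} - \mathcal{T}^\infty)(\sigma_1)\big)\right\|_1}{\|(\mathcal{T} - \mathcal{T}^\infty)(\sigma_1)\|_1}\,\frac{\|(\mathcal{T} - \mathcal{T}^\infty)(\sigma_1)\|_1}{\|\sigma_1\|_1} \\ &\le 1 + \max_{\mathrm{Tr}[\sigma_2]=0}\frac{\left\|\sum_{k=0}^{\infty}\mathcal{T}^k(\sigma_2)\right\|_1}{\|\sigma_2\|_1}\;\max_{\sigma_1}\frac{\|(\mathcal{T} - \mathcal{T}^\infty)(\sigma_1)\|_1}{\|\sigma_1\|_1}. \end{aligned} \tag{108} $$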
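We now upper bound both terms separately. We note that $\mathcal{T} - \mathcal{T}^\infty = \mathcal{T}(1 - \mathcal{T}^\infty)$. We therefore
get

$$ \begin{aligned} \max_{\sigma_1}\frac{\|(\mathcal{T} - \mathcal{T}^\infty)(\sigma_1)\|_1}{\|\sigma_1\|_1} &= \max_{\sigma_1}\frac{\|\mathcal{T}\big((1 - \mathcal{T}^\infty)(\sigma_1)\big)\|_1}{\|(1 - \mathcal{T}^\infty)(\sigma_1)\|_1}\,\frac{\|(1 - \mathcal{T}^\infty)(\sigma_1)\|_1}{\|\sigma_1\|_1} \\ &\le \max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{T}(\sigma_2)\|_1}{\|\sigma_2\|_1}\;\max_{\sigma_1}\frac{\|(1 - \mathcal{T}^\infty)(\sigma_1)\|_1}{\|\sigma_1\|_1} \\ &\le \tau(\mathcal{T})\,\|1 - \mathcal{T}^\infty\|_{1\to 1} \le \tau(\mathcal{T})\left(\|1\|_{1\to 1} + \|\mathcal{T}^\infty\|_{1\to 1}\right) \\ &\le 2\tau(\mathcal{T}). \end{aligned} \tag{109} $$

Here, we have used the fact that both $1$ and $\mathcal{T}^\infty$ are trace-preserving and completely positive, and
Cor. 4. Hence,

$$ \max_{\mathrm{Tr}[\sigma_2]=0}\frac{\left\|\sum_{k=0}^{\infty}\mathcal{T}^k(\sigma_2)\right\|_1}{\|\sigma_2\|_1} \le \sum_{k=0}^{\infty}\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{T}^k(\sigma_2)\|_1}{\|\sigma_2\|_1} \le \sum_{k=0}^{\infty}\tau(\mathcal{T})^k = \frac{1}{1 - \tau(\mathcal{T})}. \tag{110} $$

Thus we can conclude that

$$ \|\mathcal{Z}\|_{1\to 1} \le 1 + \frac{2\tau(\mathcal{T})}{1 - \tau(\mathcal{T})} = \frac{1 + \tau(\mathcal{T})}{1 - \tau(\mathcal{T})}. \tag{111} $$

This completes the proof of Lemma 9.

□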
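Both the series representation of the fundamental channel and the bound of Lemma 9 can be checked in the classical Markov-chain setting, where the argument above carries over line by line. The sketch below assumes a strictly positive column-stochastic matrix in place of the channel, the induced $\ell_1$ norm (largest absolute column sum) in place of the $1\to 1$ norm, and the Dobrushin coefficient in place of $\tau$; the names and the truncation at 200 terms are illustrative.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4
P = rng.random((n, n)) + 0.1
P /= P.sum(axis=0)                          # column-stochastic, strictly positive

P_inf = np.linalg.matrix_power(P, 200)      # numerically the projector onto the steady state
identity = np.eye(n)

# Fundamental matrix from its definition ...
Z = np.linalg.inv(identity - P + P_inf)

# ... and from the series 1 + sum_{k>=1} (P^k - P^inf), truncated after 200 terms.
series = identity.copy()
Pk = identity.copy()
for _ in range(200):
    Pk = Pk @ P
    series += Pk - P_inf


def dobrushin(M):
    """Classical ergodicity coefficient: half the largest l1 distance between columns."""
    m = M.shape[1]
    return 0.5 * max(np.abs(M[:, i] - M[:, j]).sum() for i in range(m) for j in range(m))


tau = dobrushin(P)
norm_Z = np.abs(Z).sum(axis=0).max()        # induced l1 norm of Z
print(norm_Z, "<=", (1 + tau) / (1 - tau))
```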


F. Change of the ergodicity 


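After re-gauging, the ergodicity coefficient changes. This change can be bounded in the
following way.

Lemma 10 (Bounding the change of the ergodicity coefficient). Let $\mathcal{T}_1$ and $\mathcal{T}_2$ be
trace-preserving, c.p. maps. Then

$$ |\tau(\mathcal{T}_1) - \tau(\mathcal{T}_2)| \le \tau(\mathcal{T}_1 - \mathcal{T}_2). \tag{112} $$

Proof. Without loss of generality we assume $\tau(\mathcal{T}_1)$ to be larger than $\tau(\mathcal{T}_2)$. We then have

$$ \begin{aligned} \tau(\mathcal{T}_1) - \tau(\mathcal{T}_2) &= \max_{\mathrm{Tr}[\sigma_1]=0}\frac{\|\mathcal{T}_1(\sigma_1)\|_1}{\|\sigma_1\|_1} - \max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{T}_2(\sigma_2)\|_1}{\|\sigma_2\|_1} \\ &\le \max_{\mathrm{Tr}[\sigma_1]=0}\frac{\|\mathcal{T}_1(\sigma_1)\|_1 - \|\mathcal{T}_2(\sigma_1)\|_1}{\|\sigma_1\|_1} \\ &\le \max_{\mathrm{Tr}[\sigma_1]=0}\frac{\|(\mathcal{T}_1 - \mathcal{T}_2)(\sigma_1)\|_1}{\|\sigma_1\|_1} = \tau(\mathcal{T}_1 - \mathcal{T}_2). \end{aligned} \tag{113} $$

□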
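Next, we still need to bound the ergodicity coefficient for $\mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}$. For this purpose
we use Lemma 10 (in the second step), to get

$$ \begin{aligned} \tau\big(\mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big) &\le \tau(A_n^4) + \big|\tau(A_n^4) - \tau\big(\mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big)\big| \\ &\le \tau(A_n)^4 + \tau\big(A_n^4 - \mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big). \end{aligned} \tag{114} $$

Subsequently, we bound the second term,

$$ \begin{aligned} A_n^4 - \mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1} &= A_n^4 - \mathcal{S}_n A_n^4 \mathcal{S}_n^{-1} - \mathcal{S}_n\mathcal{P}_n\mathcal{S}_n^{-1} \\ &= (1 - \mathcal{S}_n)A_n^4 + \mathcal{S}_n A_n^4(1 - \mathcal{S}_n^{-1}) - \mathcal{S}_n\mathcal{P}_n\mathcal{S}_n^{-1}. \end{aligned} \tag{115} $$

Next, we use the fact that $\tau(\mathcal{T}_1 + \mathcal{T}_2) \le \tau(\mathcal{T}_1) + \tau(\mathcal{T}_2)$ and that $\tau(\mathcal{T}) \le \|\mathcal{T}\|_{1\to 1}$ in the
following steps,

$$ \begin{aligned} \tau\big(A_n^4 - \mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big) &\le \tau\big((1 - \mathcal{S}_n)A_n^4\big) + \tau\big(\mathcal{S}_n A_n^4(1 - \mathcal{S}_n^{-1})\big) + \tau\big(\mathcal{S}_n\mathcal{P}_n\mathcal{S}_n^{-1}\big) \\ &\le \tau(1 - \mathcal{S}_n)\,\tau(A_n^4) + \|\mathcal{S}_n A_n^4(1 - \mathcal{S}_n^{-1})\|_{1\to 1} + \|\mathcal{S}_n\mathcal{P}_n\mathcal{S}_n^{-1}\|_{1\to 1} \\ &\le \|1 - \mathcal{S}_n\|_{1\to 1}\,\tau(A_n)^4 + \|\mathcal{S}_n\|_{1\to 1}\,\|A_n^4\|_{1\to 1}\,\|1 - \mathcal{S}_n^{-1}\|_{1\to 1} \\ &\quad + \|\mathcal{S}_n\|_{1\to 1}\,\|\mathcal{P}_n\|_{1\to 1}\,\|\mathcal{S}_n^{-1}\|_{1\to 1}. \end{aligned} \tag{116} $$

Using Eqs. (71) to (78) as well as (85), we conclude that

$$ \tau\big(A_n^4 - \mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big) \le \tau(A_n)^4\,3\Delta_n + \frac{1 + \Delta_n}{1 - \Delta_n}\left(3\Delta_n + \|\mathcal{P}_n\|_{1\to 1}\right). \tag{117} $$

Therefore,

$$ \tau\big(\mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big) \le \tau(A_n)^4\,(1 + 3\Delta_n) + \frac{1 + \Delta_n}{1 - \Delta_n}\left(3\Delta_n + \|\mathcal{P}_n\|_{1\to 1}\right). \tag{118} $$

Next, we can upper bound the new ergodicity coefficient $\tau(A_{n+2})$, to get

$$ \tau(A_{n+2}) = \tau\left(\frac{\mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}}{1 + \delta_n}\right) \le \tau\big(\mathcal{S}_n(A_n^4 + \mathcal{P}_n)\mathcal{S}_n^{-1}\big). \tag{119} $$

It follows, from assuming the worst case of a growing ergodicity coefficient, that

$$ \tau(A_{n+2}) \le \tau(A_n)^4\,(1 + 3\Delta_n) + \frac{1 + \Delta_n}{1 - \Delta_n}\left(3\Delta_n + \|\mathcal{P}_n\|_{1\to 1}\right). \tag{120} $$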










G. Simplifying parameter space 

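We present our bounds making use of the abbreviations $\tau_n := \tau(A_n)$, $Z_n := (1 + \tau_n^4)/(1 - \tau_n^4)$
and $P_n := \|\mathcal{P}_n\|_{1\to 1}$. These quantities can be bounded as follows,

$$ P_n \le 2\epsilon_n^2 + 5\epsilon_n^4, \tag{121} $$

$$ \Delta_n \le \frac{2 Z_n P_n}{1 - 2 Z_n P_n}, \tag{122} $$

$$ \tau_{n+2} \le \tau_n^4\,(1 + 3\Delta_n) + \frac{1 + \Delta_n}{1 - \Delta_n}\left(3\Delta_n + P_n\right), \tag{123} $$

$$ \epsilon_{n+2} \le 4\,\frac{1 + \Delta_n}{1 - \Delta_n}\left(\epsilon_n^2 + \epsilon_n^4\right). \tag{124} $$

To derive a manageable convergence threshold for $\epsilon_0$ depending on $\tau_0$, we choose the ansatz

$$ Z_n\epsilon_n \le \frac{1}{7}. \tag{125} $$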
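Next, we show that given (125) for all $n$, $\epsilon_n$ converges to zero. Subsequently, we demonstrate that
if (125) is fulfilled at step $n$, it is also fulfilled for $n + 2$. First, we assert that (125) implies

$$ \epsilon_n \le \frac{1}{7}\,\frac{1 - \tau_n^4}{1 + \tau_n^4} \le \frac{1}{7}. \tag{126} $$

Now we can upper bound $P_n$,

$$ P_n \le \left(2 + 5\epsilon_n^2\right)\epsilon_n^2 \le \frac{11}{5}\,\epsilon_n^2. \tag{127} $$

Furthermore

$$ \Delta_n \le \frac{2 Z_n P_n}{1 - 2 Z_n P_n} \le \frac{\frac{22}{5} Z_n\epsilon_n^2}{1 - \frac{22}{5} Z_n\epsilon_n^2} \le \frac{\frac{22}{35}\epsilon_n}{1 - \frac{22}{35}\epsilon_n} \le \frac{7}{10}\,\epsilon_n \tag{128} $$

is true for all $\epsilon_n \le \frac{1}{7}$ and leads us to

$$ \frac{1 + \Delta_n}{1 - \Delta_n} \le 1 + \frac{31}{20}\,\epsilon_n. \tag{129} $$

Finally, we can upper bound $\epsilon_{n+2}$ as

$$ \epsilon_{n+2} \le 4\left(1 + \frac{31}{20}\,\epsilon_n\right)\left(1 + \epsilon_n^2\right)\epsilon_n^2 \le 5\,\epsilon_n^2. \tag{130} $$

This converges to zero for $\epsilon_0 < \frac{1}{5}$, which is implied by (125). Next, we show that if (125) is true
for step $n$ it also holds for step $n + 2$. We look at the update rule of $\tau_n$, (123), and insert the bounds
(127), (128) and (129), to find

$$ \begin{aligned} \tau_{n+2} &\le \tau_n^4\left(1 + \frac{21}{10}\,\epsilon_n\right) + \left(1 + \frac{31}{20}\,\epsilon_n\right)\left(\frac{21}{10}\,\epsilon_n + \frac{11}{5}\,\epsilon_n^2\right) \\ &\le \tau_n^4 + \tau_n^4\,\frac{21}{10}\,\epsilon_n + 3\,\epsilon_n. \end{aligned} \tag{131} $$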
Inserting (126) yields

$$ \tau_{n+2} \le \tau_n^4 + \tau_n^4\,\frac{3}{10}\,\frac{1 - \tau_n^4}{1 + \tau_n^4} + \frac{3}{7}\,\frac{1 - \tau_n^4}{1 + \tau_n^4}. \tag{132} $$

We investigate two different cases, $\frac{1}{2} \le \tau_n < 1$ and $0 \le \tau_n < \frac{1}{2}$. In the first regime, we know

$$ \tau_{n+2} \le \tau_n^4 + \tau_n^4\,\frac{3}{10}\,\frac{1 - \tau_n^4}{1 + \tau_n^4} + \frac{3}{7}\,\frac{1 - \tau_n^4}{1 + \tau_n^4} \le \tau_n. \tag{133} $$

Therefore, if (125) is fulfilled, $\tau_{n+2} \le \tau_n$ and from (130) we know $\epsilon_{n+2} \le \epsilon_n$. If $\epsilon_n$ and $\tau_n$
satisfy the ansatz (125), then $\epsilon_{n+2}$ and $\tau_{n+2}$ necessarily do the same, since they are both smaller. In
the second regime we know

$$ \tau_{n+2} \le \tau_n^4 + \tau_n^4\,\frac{3}{10}\,\frac{1 - \tau_n^4}{1 + \tau_n^4} + \frac{3}{7}\,\frac{1 - \tau_n^4}{1 + \tau_n^4} \le \frac{1}{2}, \tag{134} $$

so once $\tau_n$ is smaller than $\frac{1}{2}$ it stays smaller than $\frac{1}{2}$. Therefore,

$$ Z_{n+2}\,\epsilon_{n+2} \le \frac{17}{15}\,5\,\epsilon_n^2 \le \frac{17}{21}\,\epsilon_n \le \frac{1}{7}, \tag{135} $$

and the ansatz holds true in step $n + 2$. By induction we can conclude that if (125) is satisfied for
$n = 0$ it stays satisfied for all $n = 2k$ with $k \in \mathbb{N}$. The assumption for step $n = 0$ can be
reformulated in the form

$$ \epsilon_0 \le \frac{1}{7}\,\frac{1 - \tau_0^4}{1 + \tau_0^4}. \tag{136} $$

The speed of the convergence is given by

$$ \epsilon_n \le \frac{1}{5}\left(5\epsilon_0\right)^{2^{n/2}}. \tag{137} $$

This concludes the derivation of a threshold for the recurrence protocol that ensures convergence
to maximally entangled pure states.
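The numerical constants entering this argument can be checked on a grid. The sketch below assumes the rational constants $11/5$, $22/35$, $7/10$ and $31/20$ in Eqs. (127)-(130), evaluates each inequality for $0 < \epsilon_n \le 1/7$, and then compares the recursion $\epsilon_{n+2} = 5\epsilon_n^2$ with the closed form of Eq. (137). The grid and the starting value $\epsilon_0 = 0.1$ are our own choices.

```python
import math


def bounds_hold(eps):
    """Check the chain of inequalities (127)-(130) for a single value of eps."""
    P = (2 + 5 * eps**2) * eps**2
    ok_127 = P <= 11 / 5 * eps**2
    x = 22 / 35 * eps
    delta = x / (1 - x)                  # intermediate expression in Eq. (128)
    ok_128 = delta <= 7 / 10 * eps
    ok_129 = (1 + delta) / (1 - delta) <= 1 + 31 / 20 * eps
    ok_130 = 4 * (1 + 31 / 20 * eps) * (1 + eps**2) * eps**2 <= 5 * eps**2
    return ok_127 and ok_128 and ok_129 and ok_130


grid = [k / 700 for k in range(1, 101)]  # eps from 1/700 up to 1/7
all_ok = all(bounds_hold(eps) for eps in grid)
print(all_ok)

# Worst-case recursion eps_{n+2} = 5 eps_n^2 versus the closed form (1/5)(5 eps_0)^(2^k).
eps0 = 0.1
eps = eps0
closed_form_matches = True
for k in range(1, 6):
    eps = 5 * eps**2
    closed = (5 * eps0) ** (2**k) / 5
    closed_form_matches = closed_form_matches and math.isclose(eps, closed, rel_tol=1e-9)
print(closed_form_matches)
```

Note that Eq. (129) is checked with the intermediate expression of Eq. (128) for $\Delta_n$, which is slightly tighter than the final bound $\frac{7}{10}\epsilon_n$.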


IV. BOUNDING OBSERVABLES 

So far we always dealt with unnormalised MPOs, since this allows the explicit description of the
MPO matrices independent of the length of the MPO. An MPO with the same matrices but of a
different length will generally have a different norm. We now turn to showing that if the transfer
matrix is trace-preserving, the norm of an MPO will be exponentially close to unity in $L$, the length
of the chain. We start by stating a helpful lemma.

Lemma 11 (Trace bound). If $E$ is isomorphic to $\mathcal{E}$, a trace-preserving completely positive map,
then

$$ 1 - d^{5/2}\,\tau(\mathcal{E})^L \le \mathrm{Tr}[E^L] \le 1 + d^{5/2}\,\tau(\mathcal{E})^L, \tag{138} $$

where $\tau(\mathcal{E})$ is the ergodicity coefficient of $\mathcal{E}$.
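Proof. The proof is based on the idea of writing the trace explicitly in the matrix picture for $E$ as
a sum over a basis. We choose the generalised Pauli matrices $P$, which, vectorised and normalised,
give us a convenient basis for the trace. We use that, except for the identity, all these matrices
have trace equal to zero,

$$ \mathrm{Tr}[E^L] = \langle 1 d^{-1/2}|\,E^L\,|1 d^{-1/2}\rangle + \sum_{\sigma\in P/1}\langle\sigma|\,E^L\,|\sigma\rangle. \tag{139} $$

Now we change into the channel picture to use the ergodicity coefficient,

$$ \mathrm{Tr}[E^L] \ge \frac{1}{d}\,\mathrm{Tr}\big[\mathcal{E}^L(1)\big] - d^2\max_{\mathrm{Tr}[\sigma]=0}\frac{\mathrm{Tr}\big[\sigma^\dagger\mathcal{E}^L(\sigma)\big]}{\|\sigma\|_2^2}. \tag{140} $$

Next, we can use that the identity is a fixed point of $\mathcal{E}^\dagger$. We can freely lower bound by changing
from $\|\sigma\|_2$ to $\|\sigma\|_\infty$, but we have to introduce a factor $d^{1/2}$ to replace $\|\sigma\|_2$ with $\|\sigma\|_1$. This gives

$$ \begin{aligned} \mathrm{Tr}[E^L] &\ge 1 - d^2\sqrt{d}\max_{\mathrm{Tr}[\sigma]=0}\frac{\mathrm{Tr}\big[\sigma^\dagger\mathcal{E}^L(\sigma)\big]}{\|\sigma\|_\infty\,\|\sigma\|_1} \\ &\ge 1 - d^{5/2}\max_{\mathrm{Tr}[\sigma_1]=0}\,\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\mathrm{Tr}\big[\sigma_1^\dagger\mathcal{E}^L(\sigma_2)\big]}{\|\sigma_1\|_\infty\,\|\sigma_2\|_1} \\ &\ge 1 - d^{5/2}\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{E}^L(\sigma_2)\|_1}{\|\sigma_2\|_1} \\ &\ge 1 - d^{5/2}\,\tau(\mathcal{E})^L. \end{aligned} \tag{141} $$

The upper bound can be achieved analogously.

□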
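The statement that $\mathrm{Tr}[E^L]$ approaches unity for a trace-preserving transfer channel can be illustrated numerically. The sketch below assumes a random channel built from an isometry (Kraus operators $K_i$ with $\sum_i K_i^\dagger K_i = 1$) and the vectorisation convention $E = \sum_i K_i \otimes \bar{K}_i$; it does not evaluate the ergodicity coefficient, so it only shows the convergence, not the explicit bound of Lemma 11.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n_kraus = 2, 4

# Random isometry: the reduced QR factor q has orthonormal columns, q^dagger q = 1.
g = rng.normal(size=(n_kraus * d, d)) + 1j * rng.normal(size=(n_kraus * d, d))
q, _ = np.linalg.qr(g)
kraus = [q[i * d:(i + 1) * d, :] for i in range(n_kraus)]

# Trace preservation: sum_i K_i^dagger K_i equals the identity.
completeness = sum(k.conj().T @ k for k in kraus)

# Transfer matrix isomorphic to the channel (assumed vectorisation convention).
E = sum(np.kron(k, k.conj()) for k in kraus)


def trace_of_power(L):
    """Tr[E^L] for a chain of length L."""
    return np.trace(np.linalg.matrix_power(E, L))


for L in (1, 5, 20, 80):
    print(L, abs(trace_of_power(L) - 1))
```

The deviation from unity shrinks exponentially with the chain length, since apart from the Perron eigenvalue 1 all eigenvalues of $E$ are generically smaller than one in modulus.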


We now show that it is possible to upper bound the expectation value of a Hermitian operator
with its 1-to-1 norm. The transfer matrix of an operator supported on $r$ sites is denoted as $E_O^r$, and
the isomorphic channel as $\mathcal{E}_O^r$.


Lemma 12 (Bounds to transfer operators). If $\mathcal{E}$ is a trace-preserving completely positive map and
$\mathcal{E}_O^r$ is an operator channel supported on $r$ sites, then

$$ \frac{\mathrm{Tr}\big[E_O^r E^{L-r}\big]}{\mathrm{Tr}\big[E^L\big]} \le \|\mathcal{E}_O^r\|_{1\to 1}\,\frac{1 + d^{5/2}\,\tau(\mathcal{E})^{L-r}}{1 - d^{5/2}\,\tau(\mathcal{E})^{L}}, \tag{142} $$

and by setting $r = 1$ and $E_O = A, B, C, D$ we find the local fidelity w.r.t. the respective Bell
states.


Proof. We start from

$$ \mathrm{Tr}\big[E_O^r E^{L-r}\big] = \langle 1 d^{-1/2}|\,E_O^r E^{L-r}\,|1 d^{-1/2}\rangle + \sum_{\sigma\in P/1}\langle\sigma|\,E_O^r E^{L-r}\,|\sigma\rangle. $$

As before, we switch to the channel picture to deal with the first term,

$$ \langle 1 d^{-1/2}|\,E_O^r E^{L-r}\,|1 d^{-1/2}\rangle = \frac{1}{d}\,\mathrm{Tr}\big[\mathcal{E}_O^r\mathcal{E}^{L-r}(1)\big] = \frac{\mathrm{Tr}\big[\mathcal{E}_O^r\mathcal{E}^{L-r}(1)\big]}{\mathrm{Tr}[1]} \tag{143} $$

$$ \le \|\mathcal{E}_O^r\mathcal{E}^{L-r}\|_{1\to 1} \le \|\mathcal{E}_O^r\|_{1\to 1}\,\|\mathcal{E}^{L-r}\|_{1\to 1}. \tag{144} $$
The second term will be handled accordingly. In this way, we find

$$ \begin{aligned} \sum_{\sigma\in P/1}\langle\sigma|\,E_O^r E^{L-r}\,|\sigma\rangle &\le d^2\max_{\mathrm{Tr}[\sigma]=0}\frac{\langle\sigma|\,E_O^r E^{L-r}\,|\sigma\rangle}{\langle\sigma|\sigma\rangle} \\ &= d^2\max_{\mathrm{Tr}[\sigma]=0}\frac{\mathrm{Tr}\big[\sigma^\dagger\mathcal{E}_O^r\mathcal{E}^{L-r}(\sigma)\big]}{\|\sigma\|_2^2} \\ &\le d^2\max_{\mathrm{Tr}[\sigma_1]=0}\,\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\mathrm{Tr}\big[\sigma_1^\dagger\mathcal{E}_O^r\mathcal{E}^{L-r}(\sigma_2)\big]}{\|\sigma_1\|_\infty\,\|\sigma_2\|_2}. \end{aligned} \tag{145} $$

In the last step we again made use of $\|\sigma\|_\infty \le \|\sigma\|_2$. We can hence conclude that

$$ \begin{aligned} \sum_{\sigma\in P/1}\langle\sigma|\,E_O^r E^{L-r}\,|\sigma\rangle &\le d^2\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{E}_O^r\mathcal{E}^{L-r}(\sigma_2)\|_1}{\|\sigma_2\|_2} \\ &\le d^2\sqrt{d}\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{E}_O^r\big(\mathcal{E}^{L-r}(\sigma_2)\big)\|_1}{\|\mathcal{E}^{L-r}(\sigma_2)\|_1}\,\frac{\|\mathcal{E}^{L-r}(\sigma_2)\|_1}{\|\sigma_2\|_1} \\ &\le d^{5/2}\,\|\mathcal{E}_O^r\|_{1\to 1}\max_{\mathrm{Tr}[\sigma_2]=0}\frac{\|\mathcal{E}^{L-r}(\sigma_2)\|_1}{\|\sigma_2\|_1} \\ &\le d^{5/2}\,\|\mathcal{E}_O^r\|_{1\to 1}\,\tau(\mathcal{E})^{L-r}. \end{aligned} \tag{146} $$

For the denominator we use the lower bound from Lemma 11 and this completes the proof. □

Alas, lower bounding in the same fashion is not generally possible without knowing more
about the eigenvectors of $E_O^r$. Lemma 12 tells us that upper bounds for physical expectation
values can easily be derived by looking at the channel norms, given a long chain and that the
transfer matrix is trace-preserving. This raises the question whether the transfer matrix remains
trace-preserving over the course of the iteration. This is only the case for deterministic protocols
without post-selection, since no matrices are disregarded and thus the transfer matrix $E_n$ is a power
of the previous transfer matrix. Consequently, the proof of a threshold was much more
straightforward in the deterministic case.


V. PHYSICAL MODEL 


The machinery presented above is applicable to large classes of natural preparations that give
rise to correlations resulting from memory effects. We have seen in what way the
entanglement distillation protocols can generically be applied successfully in the presence of correlations.
In this section, we present a physical model that highlights the functioning of the scheme for a
particular choice of a Hamiltonian that reflects a memory effect. This is not meant to be a
particularly feasible model: feasibility will of course depend on the physical architecture, and our formalism is
applicable to all those scenarios. It is rather meant as a paradigmatic example, to stress the general
functioning of the scheme. We start with the following set of initially uncorrelated Werner states
with fidelity $F_0$,

$$ \rho_0 = F_0\,\phi^+ + \frac{1 - F_0}{3}\left(\phi^- + \psi^+ + \psi^-\right). \tag{147} $$
These states subsequently undergo a unitary interaction with a memory bit on Bob's side,
$U(t, J) := \exp(itH)$, for $J > 0$, with

$$ H = J\,(X \otimes X + Y \otimes Y + Z \otimes Z) + Z \otimes 1 + 1 \otimes Y. \tag{148} $$

We further implement a de-phasing channel for a forgetful memory, which is applied to the
memory in-between two interactions with Bob's qubit,

$$ D_c(\sigma) = (1 - c_D)\,\sigma + c_D\,Z\sigma Z \tag{149} $$

for a suitable $c_D \in [0, 1]$. This procedure is depicted in Fig. 5. It should be clear that while this
is a particular example of a Hamiltonian and dissipative map, this feature of having an
interplay between Hamiltonians reflecting interactions with memory and dissipative channels is
generic.



FIG. 5. MPO diagram of the process generating our physical example. 


We compare the performance of the recurrence protocol on (i) sequentially prepared states with
implemented memory with (ii) perfect memoryless i.i.d. distributions of the same local fidelity. As
a measure of how a specific memory setting performs we introduce the notion of relative noise

$$ \gamma_n = \frac{1 - F_n^{\mathrm{MPO}}}{1 - F_n^{\mathrm{i.i.d.}}} \tag{150} $$

after $n$ rounds of iteration.

For certain parameters, e.g. $F_0 = 0.9$, $J = 1$, $c_D = 0.04$ and $t = 0.1$, the MPO setting
converges significantly faster than the i.i.d. setting of the same local fidelity. After one round we
have $\gamma_1 = 0.9$, meaning that we have only 90% of the noise compared to the i.i.d.
case. We included this specific MPO in Fig. 3. For longer interaction times $t = 0.47$ we get a local
fidelity of $< 0.4$. An i.i.d. setting with this fidelity does not succeed, but the MPO setting does.
This shows that we transport the unwanted inter-pair correlations introduced by the memory into
the wanted correlations between the pairs. We tested our Heisenberg memory model for different
parameters and the distillation of the correlated states performs better than the distillation in the
i.i.d. case for a large range of interaction times, see also Fig. 6.
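The ingredients of this model are simple to set up numerically. The following sketch builds the Hamiltonian of Eq. (148), the unitary $U(t, J) = \exp(itH)$ via the eigendecomposition of $H$, the Werner state of Eq. (147), and the relative noise of Eq. (150). The ordering of the two tensor factors and the function names are our assumptions, and the code does not implement the full sequential preparation or the distillation rounds.

```python
import numpy as np

X = np.array([[0, 1], [1, 0]], dtype=complex)
Y = np.array([[0, -1j], [1j, 0]], dtype=complex)
Z = np.array([[1, 0], [0, -1]], dtype=complex)
ID = np.eye(2, dtype=complex)


def hamiltonian(J):
    """Heisenberg coupling plus local terms, Eq. (148)."""
    heisenberg = np.kron(X, X) + np.kron(Y, Y) + np.kron(Z, Z)
    return J * heisenberg + np.kron(Z, ID) + np.kron(ID, Y)


def unitary(t, J):
    """U(t, J) = exp(i t H), computed from the eigendecomposition of the Hermitian H."""
    w, v = np.linalg.eigh(hamiltonian(J))
    return (v * np.exp(1j * t * w)) @ v.conj().T


def werner_state(F0):
    """Werner state with fidelity F0 with respect to the Bell state phi^+, Eq. (147)."""
    phi_plus = np.array([1, 0, 0, 1], dtype=complex) / np.sqrt(2)
    proj = np.outer(phi_plus, phi_plus.conj())
    # The three remaining Bell projectors sum to the identity minus proj.
    return F0 * proj + (1 - F0) / 3 * (np.eye(4) - proj), phi_plus


def relative_noise(F_mpo, F_iid):
    """Relative noise of Eq. (150): values below one favour the correlated setting."""
    return (1 - F_mpo) / (1 - F_iid)


U = unitary(0.1, 1.0)
rho0, phi_plus = werner_state(0.9)
fidelity = (phi_plus.conj() @ rho0 @ phi_plus).real
print(np.allclose(U @ U.conj().T, np.eye(4)), fidelity)
```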
FIG. 6. Relative noise after three steps of iteration for different initial fidelities $F_0$ and interaction parameter
$J$ with $c_D = 0.04$. Whenever the relative noise is smaller than one, we see a comparable advantage over
the i.i.d. case.















